
«A dissertation submitted to the Department Of Computer Science, Faculty of Science at the University Of Cape Town in partial fulfilment of the ...»


ITU-T Series H Supplement 1 describes the factors to be taken into account when low bit-rate video is used for Sign Language and lip-reading telecommunications. The document sets out performance requirements that should be met to ensure a successful person-to-person conversation using a video communication system. In setting the requirements, video compression is ignored and the focus is on resolution and frame rate. The stated requirements should not be taken as fixed and absolute, however; depending on the situation they may need to be more stringent or more relaxed.

The document shows that 20 frames per second provides good usability for both sign language and lip-reading, and that the video remains understandable at 12 frames per second. Between 8 and 12 frames per second usability becomes very limited, and below 8 frames per second the video has no practical usefulness.

For person-to-person sign language video communication, Supplement 1 concludes that Quarter Common Intermediate Format (QCIF) (176 x 144 pixels) resolution is usable, and that an increase to CIF (352 x 288 pixels) gives better language perception. Sub Quarter Common Intermediate Format (SQCIF) (128 x 96 pixels) is too coarse for reliable perception, with only some signs occasionally perceivable.

The application profile concludes with a basic performance goal of 25-30 frames per second at CIF (352 x 288 pixels) resolution. In very low bit-rate environments the frame rate may, if needed, be dropped to 12-15 frames per second at a resolution of 176 x 144 pixels.
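
The frame rate bands summarised above can be written as a small lookup. This is a minimal sketch only: the function name `frame_rate_usability` and the band labels are illustrative, and treating each quoted figure (20, 12 and 8 frames per second) as an inclusive lower bound is an assumption, since the text does not say how the boundaries themselves are classified.

```python
def frame_rate_usability(fps: float) -> str:
    """Map a frame rate to the usability bands quoted from Supplement 1.

    Assumption: each quoted figure is treated as an inclusive lower bound.
    """
    if fps >= 20:
        return "good"
    if fps >= 12:
        return "understandable"
    if fps >= 8:
        return "very limited"
    return "not practically useful"


if __name__ == "__main__":
    for fps in (25, 15, 10, 5):
        print(f"{fps} fps: {frame_rate_usability(fps)}")
```

Under these assumptions the fallback range of 12-15 frames per second falls in the "understandable" band, while the 25-30 frames per second goal falls in the "good" band.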

2.6.2 Subjective and Objective evaluation of video quality

When looking at Sign Language communications over limited-bandwidth communications channels such as the cellular telephone network, an appropriate quality measurement is needed to compare different video parameters. In a subjective evaluation, video sequences are shown to a group of viewers. The viewers' opinions of the video material are then captured, assigned a numeric value and averaged to provide a quality measurement for the video sequence. The details of the testing can vary depending on the objective of the testing and the aspect of the video that is being evaluated.

Objective video quality metrics are mathematical models that approximate the results of subjective quality assessments as closely as possible. Metrics such as mean square error (MSE) and peak signal-to-noise ratio (PSNR) are the most widely used objective measures for evaluating video. These measurement techniques, however, focus on traditional quality in terms of aesthetics. Recent objective quality measures that model the human visual system have shown substantial improvements over MSE and PSNR in predicting aesthetic quality. But as Ciaramello et al. [6] state, sign language video is a communications tool, and its quality must be judged in terms of intelligibility.

Ciaramello et al. [6] demonstrated that PSNR is not a good measure of intelligibility in Sign Language video material. They went on to propose and evaluate a metric based on the spatial structure of ASL, expressed as a function of the MSE in both the hands and the face. The proposed metric gave a substantial improvement over PSNR.
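
For reference, the standard definitions of MSE and PSNR mentioned above can be sketched as follows. This is a minimal illustration, not the metric proposed by Ciaramello et al. [6]: the function names are illustrative, frames are assumed to be equally sized grayscale images stored as lists of rows, and the peak value of 255 assumes 8-bit samples.

```python
import math


def mse(reference, degraded):
    """Mean square error between two equally sized grayscale frames."""
    total = 0
    count = 0
    for ref_row, deg_row in zip(reference, degraded):
        for ref_px, deg_px in zip(ref_row, deg_row):
            total += (ref_px - deg_px) ** 2
            count += 1
    return total / count


def psnr(reference, degraded, max_value=255):
    """Peak signal-to-noise ratio in dB (assumes 8-bit samples by default)."""
    error = mse(reference, degraded)
    if error == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    compressed = [[12, 18], [30, 40]]
    print(f"MSE  = {mse(original, compressed):.2f}")
    print(f"PSNR = {psnr(original, compressed):.2f} dB")
```

Both functions weight every pixel equally, which is one way to see why they track overall picture fidelity rather than the intelligibility of the hands and face.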

The user experience of MobileASL was evaluated in a laboratory setting, with both subjective and objective measures [5]. The subjective measurements were done in a conversational setting, with two participants conversing in Sign Language using cellphones. The quality of the video was measured subjectively by how hard or easy it was to understand. This was done through a five-question questionnaire. The survey questions were the following:

1. During the video, how often did you have to guess what the signer was saying (where 1 is never and 5 is all the time)?

2. How difficult would you say it was to comprehend the video (where 1 is very easy and 5 is very difficult)?

3. Changing the frame rate of the video can be distracting. How would you rate the annoyance level of the video (where 1 is not annoying at all and 5 is extremely annoying)?

4. The video quality over a cell phone is not as good as video quality when communicating via the Internet (e.g., by using a web cam) or over a set top box. However, cell phones are convenient since they are mobile. Given the quality of conversation you just experienced, how often would you use the mobile phone for making video calls versus just using your regular version of communication (e.g., go home to use the Internet or set top box, or just text)?

5. If video of this quality were available on the cell phone, would you use it?

The objective measure of the video quality was made by counting the number of repair requests, the number of times the requester asked for a repeat within each repair request, and the number of conversational breakdowns. These counts were all taken from the videotaped user study sessions, during which participants held conversations using phones set on a table in front of them.

As Nakazono et al. [19] state, in evaluating Sign Language video we must evaluate how well the linguistic information is transmitted and should be careful not to be swayed by impression of the appearance of the video. They used two kinds of evaluations, the intelligibility test and the opinion test.

In the intelligibility test a short video sequence of sign language is presented to the subjects, who are instructed to write down the contents of the sentences. The written sentences are then scored from 0 to 3, taking care that the scores are not affected by differences in the subjects' ability in written language.

In the opinion test a short video sequence of sign language is presented to the subjects, who are asked to rate the intelligibility of the sign language on five levels, from 1 to 5. The mean of the scores is used as the evaluated value of the data. In this study the subjects were asked to evaluate the intelligibility of the sign video, and not their preference in terms of picture quality.

Ciaramello et al. [7] used a four-question, multiple-choice survey given on a computer at the end of each video in their subjective sign language video evaluation. The first question, “What was the name of the main character in the story?”, was asked to encourage the participants to pay close attention to the contents of the video, and was not used in any statistical tabulation. The second question was “How difficult would you say it was to comprehend the video?” with five possible answers: very easy (1.00), easy (0.75), neither easy nor difficult (0.50), difficult (0.25) and very difficult (0.00). The third question asked “How would you rate the annoyance level of the video?”, this time with four possible answers: not at all annoying (1.00), a little annoying (0.66), somewhat annoying (0.33) and extremely annoying (0.00). The fourth question asked whether the participant would use a video cell phone at this video quality. The subjective intelligibility and annoyance ratings for each video were calculated by averaging the participants’ answers to the second and third questions.
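
The scoring scheme described above can be sketched as a simple lookup and average. This is an illustrative reading rather than the authors' implementation: the names `DIFFICULTY_SCORES`, `ANNOYANCE_SCORES` and `rate_video` are hypothetical, and it assumes that each question is averaged separately across participants to give one intelligibility rating and one annoyance rating per video.

```python
from statistics import mean

# Answer values as listed in the text for questions two and three.
DIFFICULTY_SCORES = {
    "very easy": 1.00,
    "easy": 0.75,
    "neither easy nor difficult": 0.50,
    "difficult": 0.25,
    "very difficult": 0.00,
}
ANNOYANCE_SCORES = {
    "not at all annoying": 1.00,
    "a little annoying": 0.66,
    "somewhat annoying": 0.33,
    "extremely annoying": 0.00,
}


def rate_video(responses):
    """Return (intelligibility, annoyance) ratings for one video.

    `responses` holds one (difficulty answer, annoyance answer) pair per
    participant. Assumption: each question is averaged across participants.
    """
    intelligibility = mean(DIFFICULTY_SCORES[d] for d, _ in responses)
    annoyance = mean(ANNOYANCE_SCORES[a] for _, a in responses)
    return intelligibility, annoyance


if __name__ == "__main__":
    # Made-up answers from two hypothetical participants.
    example = [
        ("very easy", "not at all annoying"),
        ("difficult", "extremely annoying"),
    ]
    print(rate_video(example))
```

Note that on both scales a higher value is the more favourable answer, so a high "annoyance" rating means the video was found less annoying.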

2.7 Summary

Sign Language is a visual language that conveys meaning through a combination of hand shapes, movement of the hands and arms, and facial expressions. It therefore requires a visual telecommunication channel, making video the only appropriate means of first-language telecommunications for the Deaf community.

Video quality can be evaluated either subjectively, by capturing viewers' opinions of video material, or objectively, using mathematical analysis of the video. The objective evaluations, although good at predicting perceived quality in terms of aesthetics, are less suited to quantifying the intelligibility of video material, which depends on far more than whether the video looks good. In addition, Sign Language is not a single language: there are many sign languages across the world, as well as different dialects within a single sign language such as SASL.

Video communication using mobile phones presents three main challenges: low bandwidth, low processing speed and limited battery life. In an attempt to overcome these challenges, Sign Language specific video compression techniques have been investigated. These techniques, however, rely on modified versions of the standard video encoders to provide better compression, which cannot be implemented on all phones, especially at the lower end of the market (the target audience of this research).

This research is not focused on video compression schemes, but on the effect of the reduction of video resolution and frame rate on the intelligibility of video containing SASL. The objective is to evaluate the intelligibility of the sign language video, not the picture quality of the video.

3 Pilot user study (Experiment 1)

Based on the ITU requirements and limitations (see Section 2.8.1) and the aim of subjectively evaluating Sign Language video on a cell phone, a pilot study was conducted to validate, with Deaf participants, the questionnaire for evaluating the intelligibility of SASL video on a cell phone (see Appendix A).

3.1 Aim

The pilot user study aimed to validate the questionnaire with the Deaf participants for evaluating the intelligibility of SASL video on a cell phone, and to uncover any problems with the planned experimental setup. Reducing the video resolution and frame rate is the simplest way to reduce video file size, and thus the amount of data that has to be transferred over the cell phone network. This experiment only looked at the impact of video resolution and frame rate, keeping compression constrained to 256 kbps in all of the test videos.

3.2 Background

The size of a video file is determined by three basic settings: the video resolution (spatial resolution), the video frame rate (temporal resolution) and how the video has been compressed.

3.2.1 Video Resolution

Video resolution is the size (width and height) of the frames in the video. The lower the resolution, the less detail in the video content and the less storage is needed per video frame.

This experiment will be looking at two resolutions, namely:

• 320 x 240 (Quarter Video Graphics Array (QVGA))
• 176 x 144 (3GP)

The resolution of 352 x 288, although an industry standard resolution for video compression and used for capturing video on cell phones, is higher than the physical screen on the cell phones used can display, and was for this reason dropped from the study. A resolution above 320 x 240 would have been preferable, but the standards for cheaper cell phones meant this was not feasible.

3.2.2 Video Frame Rate

The video frame rate is the number of frames of video stored and displayed per second. The lower the frame rate, the less storage is needed per second of video. At lower frame rates, however, less detail of objects in motion is visible and blurring of the image starts to occur, which can become a problem, especially in Sign Language.

This experiment will be looking at the following three frame rate values:

• 30 frames per second
• 15 frames per second
• 10 frames per second

3.2.3 Video Compression

Video compression is used to process the frames of the video, at the given resolution and frame rate, to further reduce the amount of storage required by the video. The size reduction and resulting quality of the final video depend not only on which video compression algorithm was used, but also on which compression and quality settings were used. In general, though, the more the video is compressed, the lower the quality and the smaller the file size.

In this experiment video compression was kept to a minimum and consistent throughout the twelve video clips, so that only the impact of resolution and frame rate on the size and the intelligibility of the video would be seen.
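
The interaction of the three settings can be made concrete with a little arithmetic. The sketch below is illustrative only: it takes the smaller resolution as 176 x 144 (the QCIF size given earlier in the text), reads the 256 kbps constraint as 256,000 bits per second, and uses hypothetical helper names. It computes how many pixels must be encoded per second for each resolution and frame rate, and how many encoded bits are available per pixel at the fixed bit rate.

```python
# Resolutions and frame rates listed in Section 3.2.
RESOLUTIONS = {"320 x 240": (320, 240), "176 x 144": (176, 144)}
FRAME_RATES = (30, 15, 10)

# Assumption: 256 kbps is read as 256,000 bits per second.
TARGET_BITRATE = 256_000


def pixels_per_second(width, height, fps):
    """Number of pixels the encoder has to represent each second."""
    return width * height * fps


def bits_per_pixel(width, height, fps, bitrate=TARGET_BITRATE):
    """Average encoded bits available per pixel at a fixed bit rate."""
    return bitrate / pixels_per_second(width, height, fps)


if __name__ == "__main__":
    for name, (width, height) in RESOLUTIONS.items():
        for fps in FRAME_RATES:
            rate = pixels_per_second(width, height, fps)
            bpp = bits_per_pixel(width, height, fps)
            print(f"{name} @ {fps} fps: {rate} pixels/s, {bpp:.3f} bits/pixel")
```

At a fixed bit rate, lowering either the resolution or the frame rate leaves more bits for each remaining pixel, which is why the two parameters can be traded against compression artefacts.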

3.3 Procedure

3.3.1 Participants

Five adult members of the Deaf community (five men, no women) ranging in age from 33 to 46 (mean = 36) participated in this study. All were native signers and had used SASL as their principal mode of communication all their lives. The five participants were all staff members of DCCT, and had English as their language of literacy, regardless of what their hearing families used.

All participants were introduced to the experiment and each signed a consent form to confirm that they fully understood the project, agreed to participate and understood that all information provided would be kept confidential.

3.3.2 Experimental setup

The participants were gathered in a high-ceilinged, open room with fluorescent lighting and windows on one side. They were seated at desks arranged in a half circle, two participants to a desk. In front of each participant was a pack of 12 questionnaires (numbered A1-A12, B1-B12, and so forth), a pen, and a Nokia N96 cell phone preloaded with the correspondingly numbered video clips.

All communications between the researcher and participants were interpreted by a certified SASL interpreter who was known to the participants. Although the questionnaires were explained in SASL and all queries were answered through the SASL interpreter, the questionnaires were provided in written English and answered in written English.

The participants were introduced to the experiment with the help of the SASL interpreter. It was made clear during the introduction that the focus of the experiment was on evaluating the quality of the video clips and the intelligibility of the SASL in the video clips at different quality settings, and not to evaluate the participants’ proficiency in SASL.

Since written/spoken English is not the participants’ first language, and the questionnaire required the participants to write down what they understood the Sign Language video clip to contain, all participants were asked whether they were comfortable writing out their answers. They were given the option of giving their responses to the questionnaire through the interpreter. None of the participants took this option; all indicated that they were comfortable reading the questionnaire and writing down their responses in English.

The participants were asked to view each video clip only once and then complete the questionnaire for that clip, rating its intelligibility without reviewing it. This was done to capture the participants’ initial response to the video clip, and to prevent them from reviewing sections of the clip that were unclear. If any sections were unclear, that should be reflected in the answers for that clip.


