A dissertation submitted to the Department of Computer Science, Faculty of Science at the University of Cape Town in partial fulfilment of the ...
Secondly, the video clips taken from the DVD were letterboxed (Figure 3-9) down to the test resolutions, instead of being cropped (Figure 3-10) to the desired resolution. The DVD material was shot at a widescreen aspect ratio of 16 x 9, while the cell phone screen has an aspect ratio of 4 x 3.
In addition to the screen real estate wasted on black bars, the widescreen format included extraneous background that was never used by the signer in the video. A better technique would have been to crop the DVD material to the required resolutions, so that the full screen of the cell phone was used to show the signer, with no space wasted on black bars or background, better fitting the available screen area to the signing space. This would also have simulated the video captured by the cell phone more accurately.
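The geometry of cropping a widescreen frame for a 4 x 3 screen can be sketched as follows (the 1280 x 720 source size is illustrative, not the actual DVD resolution):

```python
def crop_to_aspect(src_w, src_h, target_w, target_h):
    """Return (crop_w, crop_h): the largest centred region of the
    source frame that matches the target aspect ratio."""
    if src_w * target_h >= src_h * target_w:
        # Source is wider than the target: keep the full height and
        # trim the unused background at the sides.
        crop_h = src_h
        crop_w = src_h * target_w // target_h
    else:
        # Source is taller than the target: keep the full width and
        # trim the top and bottom.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    return crop_w, crop_h

# A 16 x 9 frame cropped for a 4 x 3 phone screen keeps the full
# height and discards the sides:
print(crop_to_aspect(1280, 720, 4, 3))  # -> (960, 720)
```

Letterboxing instead scales the whole 16 x 9 frame to fit the 4 x 3 width, leaving black bars above and below; cropping trades the unused side background for a signer that fills the screen.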
The last factor that could have influenced the responses from the participants is unrelated to the quality of the video clips: the actual content of the clips. To evaluate the intelligibility of the video clips, questions were asked about their content and whether the participants understood what was communicated in each clip through SASL. But what if a participant could clearly distinguish the face and hand movements in a video clip, yet the actual signs used in the clip were unfamiliar to the participant? This would have had a negative effect on that participant's rating of the intelligibility of the video clip.
4 Follow-up Pilot User Study (Experiment 2)
4.1 Aim

The aim of the follow-up pilot study was to address the compression issue by eliminating the impact of the limited bit rate on the higher resolution and frame rate video clips, as well as to validate the new questionnaire incorporating the lessons learned in the initial pilot study (see Appendix B).
Through the experience and findings of the pilot study, the following changes were made to the experimental setup in the follow-up study:
To remove as much video quality degradation due to video compression artefacts as possible, and to get as close as possible to uncompressed video, all clips were compressed at a data rate of 5 000 kbits/s.
To more accurately resemble the video that would have been captured on the cell phone itself when using the cell phone as a video-based Sign Language communication medium, all clips were cropped (Figure 3-10) instead of being letterboxed (Figure 3-9). This removed extraneous background area, better fitting the available screen area to the signing space of the signer, making full use of the cell phone's screen area and keeping the resultant video clips at the same aspect ratio as the resolutions being tested.
To encourage the participants to focus more and give more detailed feedback throughout, the number of clips was reduced by including only one clip from each resolution-frame rate combination. In addition, the questions were simplified.
In the pilot study a few comments mentioned the video being “not clear”. To expand on these comments, the two questions in the pilot study covering the details of the video were extended to five questions, covering motion blurring and the speed of the video in addition to the visibility of facial and hand detail.
Lastly, in an attempt to factor in the possible unfamiliarity of the actual signs used in the video clips, the participants were allowed to view each clip as many times as desired before and while completing the questionnaire. The number of times a clip was viewed was captured on the questionnaire, and a question was added to specifically investigate this factor.
4.2.1 Participants

Six adult members of the Deaf community (three women, three men), ranging in age from 33 to 64 (mean = 38), participated in this study. All were native signers and had used SASL as their principal mode of communication all their lives. The six participants were all staff members of DCCT and had English as their language of literacy, regardless of what their hearing families used.
Three had taken part in the first pilot study.
All participants were introduced to the experiment, and each signed a consent form to confirm that they fully understood the project, agreed to participate, and understood that all information provided would be kept confidential.
4.2.2 Experimental Setup

The follow-up experiment was conducted in the same high-ceilinged, open venue as the pilot study.
The six participants were seated at similar desks arranged in a half circle, two participants to a desk. In front of each participant was an individually numbered pack of six questionnaires, a pen, and a Nokia N96 cell phone preloaded with the corresponding video clips.
All communications between the researcher and participants were interpreted by a certified SASL interpreter who was known to the participants. Although the questionnaires were explained in SASL and all queries were answered through the SASL interpreter, the questionnaires were provided in written English and answered in written English.
The participants were introduced to the experiment with the help of the SASL interpreter. It was made clear during the introduction that the focus of the experiment was on evaluating the quality of the video clips and the intelligibility of the SASL in the video clips at different quality settings, and not to evaluate the participants’ proficiency in SASL.
Seeing that written/spoken language was not the participants’ first language, and the questionnaire required the participants to write down what they understood the Sign Language video clip to contain, all participants were asked if they were comfortable writing out their answers. They were given the option of giving their responses to the questionnaire through the interpreter. None of the participants took this option, and all indicated that they were comfortable writing down their responses in English.
In the first experiment some of the questions had to be explained while the participants were answering the questionnaires; in addition, there was some confusion about finding clips and about how many times a clip was to be viewed. In an attempt to alleviate these problems and make sure each completed questionnaire was completely reliable, a practice video clip, similar in look and difficulty to those used in the experiment, and a practice questionnaire, identical to the questionnaire used in the experiment, were added. This enabled the participants to familiarise themselves with the phone and the questionnaire, and gave them an opportunity to ask for clarification on any of the questions, as well as on the use of the cell phone, as they worked through the practice questionnaire. After an introduction to and demonstration of the cell phone, the participants were asked to view the practice clip on the cell phone and answer the separate loose practice questionnaire. The answers to the practice questionnaire were not captured or used in the experiment.
After all the participants had finished the practice evaluation in their own time, it was confirmed with each participant individually that they were comfortable with the questionnaire and could select and play any of the video clips and move to the next video clip without problems. They were then given the go-ahead to start filling in the questionnaires evaluating the six video clips. The participants were allowed to view any clip as many times as they wanted to, with a count of the views noted on the questionnaire.
4.2.3 Cell phones

The same set of Nokia N96 cell phones that were used in the pilot study was used in this experiment, and again it was left up to the participants to decide how the cell phones would be held while viewing the clips. As in the pilot study, all participants held the phone in the default portrait orientation, at a distance comfortable to each individual participant. Some held the phone in their hand, while others preferred the phone lying flat on the table while viewing a video clip.
4.2.4 Video clips

Six video clips were used, each showing the same sign language user in the same environment, with consistent lighting, background and distance from the camera, signing in SASL.
The same two resolutions as in the pilot study were used in the follow-up study:

320 x 240 (QVGA)
176 x 144 (3GP)

Similarly, the same three frame rates as in the pilot study were used:

30 frames per second
15 frames per second
10 frames per second

Where in the pilot study the video was taken directly from the widescreen DVD material and resized to the desired resolutions by letterboxing, in this experiment the video clips were cropped before being taken to the desired resolution. This ensured that no space on the cell phone screen was wasted on black bands or unused background area, making much better use of the available screen resolution and better fitting the available screen area to the signing space of the signer, and giving an accurate simulation of the screen real estate usage that would apply when the phone was used for video communication.
These six clips were acquired from a DVD as MPEG-4 files at full resolution and frame rate, and at the best possible quality. Each of the clips was cropped and recompressed to the required resolution and frame rate, using the Export (Using QuickTime conversion) feature of Final Cut Express (v4.0.1).
A data rate of 5000 kbits/sec was used to minimise the impact of the video compression on the quality of the resulting video clip.
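As a rough sanity check on the chosen data rate, the expected file size follows directly from the rate and the clip duration (the 10-second duration below is illustrative; the actual durations and file sizes are listed in Table B-7):

```python
DATA_RATE_KBITS = 5000  # kbits per second, as used for all clips

def approx_size_mb(duration_s, data_rate_kbits=DATA_RATE_KBITS):
    """Approximate file size in megabytes for a clip of the given
    duration encoded at the given data rate (kbits -> bytes -> MB)."""
    return duration_s * data_rate_kbits / 8 / 1000

# A 10-second clip at 5000 kbits/s comes to roughly 6.25 MB.
print(approx_size_mb(10))
```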
The basic details of the six video clips are shown in Table 4-1. The full details of the video clips, including the data rate, file size and duration of each of the video clips are available in Table B-7, in Appendix B.
Five sets of clips, one set per participant, were then created from the six prepared clips. Each set contained the same six clips, but in a different random order. The randomisation was done using Microsoft Excel.
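The study used Microsoft Excel for the randomisation; an equivalent per-set shuffle can be sketched in Python (the clip numbers follow Table 4-1, while the participant labels and variable names are illustrative):

```python
import random

# Clip numbers as in Table 4-1; labels A-E identify the five sets.
clips = [1, 2, 3, 4, 5, 6]
participants = ["A", "B", "C", "D", "E"]

sets = {}
for p in participants:
    order = clips[:]        # every set contains the same six clips
    random.shuffle(order)   # but in an independently randomised order
    sets[p] = order

for p, order in sets.items():
    print(p, order)
```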
Video No   Resolution (w x h)   Frames per second   Signed phrase
1          320 x 240            30                  The girl rides the horse.
2          320 x 240            15                  The man bounces the ball on his head.
3          320 x 240            10                  The small boy is dirty all over.
4          176 x 144            30                  Tomorrow is my birthday.
5          176 x 144            15                  Yesterday I caught a big fish.
6          176 x 144            10                  Your T-shirt is too small for you.

Table 4-1: Experiment 2 - Video clip specifications

The order of the clips was randomised to minimise the possibility that the participants could assume the next clip would be of better quality than the previous one. In addition, the clips were randomised between sets to ensure that there was no accidental influence between participants on the quality evaluation of the clips.
These five sets of six randomly ordered video clips of differing quality were then copied, one set per cell phone, to five Nokia N96 cell phones. Other than the filenames of the six files, there was no difference between the phones, the files, or how the videos were viewed by the users.
As there were six participants and only five phones available on the day of the experiment, the fifth phone was shared between participants E and F. Thus the clip order for video clip set E and video clip set F was identical. However, close observation showed that the two participants viewed the clips and answered the questionnaire completely separately.
4.2.5 Questionnaire

Each set of questionnaires, as shown in Appendix B, contained a cover page explaining the purpose of the experiment and providing a summary of the experimental procedure.
A questionnaire was attached for each video clip to be evaluated. All answers were captured, but only the answers to the five-point scale questions were assigned a numeric value; the answers to the freeform questions were not. The more acceptable the video, the higher the value assigned to the answer.
In addition to the six clips to be evaluated, a practice video clip was included. This clip and a separate loose questionnaire sheet were used to familiarise the participants with playing the video clips and with understanding and answering the questionnaire. When all participants felt comfortable with the phone and the questionnaire, they were given the go-ahead to evaluate the six video clips.
Question 1: What was said in this video?
As in the pilot user study, this question served two purposes. The first was to encourage the participant to pay attention to, and concentrate on understanding, what was said in the video. The second was to get an idea of how close the participant’s understanding of the message was to the original phrase.
The answer to this question was captured, but no numeric value was assigned to the answer.
Question 2: How sure are you of your answer to Question 1 above?
Possible answers: completely sure / sure / so-so / not sure / not sure at all

The second question aims to provide a numeric value for the comprehensibility of the sign language in the video clip. This question functions in conjunction with Question 1 and provides an opportunity to check the participant’s answers. If the participant correctly wrote down the signed phrase in Question 1, the answer to this question should show the participant being sure of his or her answer.
This question was assigned a numeric value, with completely sure given a value of 5, down to 1 for not sure at all.
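A minimal sketch of this answer-to-score mapping (the dictionary and function names are illustrative, not taken from the study's analysis):

```python
# Five-point scale for Question 2, as described in the text:
# "completely sure" scores 5, down to 1 for "not sure at all".
CERTAINTY_SCORES = {
    "completely sure": 5,
    "sure": 4,
    "so-so": 3,
    "not sure": 2,
    "not sure at all": 1,
}

def score(answer):
    """Map a written questionnaire answer to its numeric value."""
    return CERTAINTY_SCORES[answer.strip().lower()]

print(score("completely sure"))  # -> 5
print(score("not sure at all"))  # -> 1
```

The same pattern applies to the other scale questions, with the higher value always assigned to the more acceptable answer.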
Question 3: How easy or how difficult was it to understand what was said in this video?
Possible answers: very difficult / difficult / average / easy / very easy

Question 3 was carried over from the pilot study as a further check of intelligibility, this time with the wording and the order of the values changed, to help confirm the participant’s ability to understand the contents of the video clip. The first three questions should correlate closely and, if all three point in the same direction, give a good indication of the intelligibility of the sign language content at the given resolution and frame rate.
This question was assigned a numeric value, with very easy given a value of 5, down to 1 for very difficult.
Question 4: Please select the appropriate choice from the options provided below.
From the results of the pilot study it was decided to simplify, but also broaden, the evaluation of the different aspects of the video quality from the perspective of the Deaf user.
In the pilot study, quite a few comments mentioned blurry motion and the speed of the video.