Available online at www.sciencedirect.com

ScienceDirect

Procedia - Social and Behavioral Sciences 127 (2014) 119 - 123

PSIWORLD 2013

The effect of evaluation strategy and music performance presentation format on score variability of music students' performance assessment

Dorina Iusca

_"George Enescu" University of Arts, 7-9 Horia Street, Iasi, 700126, Romania_

Abstract

The present study aims to investigate the score variability of music performance evaluation according to two factors: the music performance presentation format (audio versus audio-visual presentation of students' music performance) and the evaluation strategy (global versus segmented evaluation of students' music performance). A growing body of literature has previously suggested that these two factors tend to significantly modify the scores of music performance evaluations. We recorded 50 undergraduate music students in standard conditions and we used a panel of expert evaluators in order to assess them by combining two conditions: audio/audio-visual presentation and global/segmented evaluation strategy. Results have shown that the use of segmented scale determines higher ratings for the technical level and lower ratings for the expression of music performance.

© 2014 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).

Selection and peer-review under responsibility of Romanian Society of Applied Experimental Psychology.

Keywords: music performance assessment, presentation format, evaluation strategy, music education

* Corresponding author.

E-mail address: dorinaiusca@yahoo.com

ISSN 1877-0428. doi: 10.1016/j.sbspro.2014.03.224

1. Introduction

Assessing students' music performance level is a multifaceted activity. Its results depend not only on the student's musical training but also on a variety of extra-musical elements related to the assessment context, the evaluators' characteristics, or the performer's personality features and psychological states.

The educational, artistic and research practice of music performance has revealed a general controversy regarding the evaluation strategies of music performance: experts have not reached a consensus on whether to use global marks or segmented scales. Previous research has also indicated that the music performance presentation format (audio versus audiovisual) has a significant effect on adjudicators' performance ratings, although the overall results remain inconclusive.

The present research investigates the influence of the evaluation strategy (global versus segmented evaluation of students' music performance) and the performance presentation format (audio versus audiovisual) on score variability. Does either of these factors affect students' scores when the evaluation is done by music experts?

2. Theoretical framework and previous research

Global assessment of music performance is defined as the situation in which adjudicators assign an overall rank or score that reflects their general impression, based on personally selected implicit or explicit criteria (Wrigley, 2005). By contrast, segmented evaluations involve the use of explicit and clearly defined criteria that usually form a criterion-based rating scale with standard qualities (Wrigley, 2005).

A recent study (Stanley et al., 2002) has shown that preferences in using global or segmented evaluations among conservatoire staff are divided. While some examiners feel that using the criteria helps them focus on important assessment issues, others consider segmented evaluation as a way to narrow their view on music performance.

Earlier studies on music performance assessment (Fiske, 1975, 1977, 1983; Burnsed, Hinkle & King, 1985; Burnsed & King, 1987, cited in Forbes, 1994) have shown that interjudge reliability is higher in the case of global assessment than in segmented evaluation. Furthermore, the large correlations found between each scale factor and the overall mark exposed the redundancy of the evaluation criteria. This is why researchers (Fiske, 1975, cited in Forbes, 1994) suggested that using global assessments may reduce adjudicators' unnecessary efforts by letting them focus more on the performance and not on evaluation strategies.

Other voices (Thompson & Williamon, 2003) have pointed out that global evaluation may have a higher level of ecological validity because it keeps outside intervention in the assessment process to a minimum. Consequently, some studies (Ryan & Costa-Giomi, 2004; Wapnick et al., 2000; Thompson et al., 2007; Geringer et al., 2009) have used global assessment in order to keep the research design more economical.

However, many researchers (Bergee, 2003; Geringer et al., 2009; Zdzinski & Barnes, 2002) have questioned the efficiency of global assessment due to its low standard qualities and have proposed the use of valid and reliable segmented scales. A growing body of literature has described the concept by identifying several factors that form music performance. Here are some examples: intonation, dynamics, phrasing, and tone (Wapnick et al., 1998); tone quality, pitch accuracy, rhythm accuracy, expression and stylistic correspondence (Ryan et al., 2006); tone quality, melodic accuracy, pitch, rhythmic accuracy, tempo, interpretation and technique (Hewitt, 2007); musical elements, instrument mastering and presentation (Ciorba & Smith, 2009); phrasing, intonation, rhythm, dynamics and tone (Geringer & Madsen, 1998). Each of these views of music performance has demonstrated different levels of standardization regarding validity and reliability. However, many approaches describe music performance through two constitutive factors: technique and expression (Griffiths, 2009; Kinney, 2009; Cantwell & Jeanneret, 2004; Gabrielsson, 2003; Thompson & Williamon, 2003). These two factors are usually described by using certain items corresponding to the technical or expressive dimension of music performance. This bi-factorial perspective on music performance is a common practice in both research and artistic areas (Thompson & Williamon, 2003).

In recent years, interdisciplinary efforts have been made to create valid and reliable segmented scales for music performance assessment. Some scales may be applied to all classical instruments (Thompson & Williamon, 2003; Haroutounian, 2007; Burrack, 2002), others to an instrumental group, such as the Woodwind Brass Solo Evaluation Form (Saunders & Holahan, 1997, cited in Hewitt, 2007) and the String Performance Rating Scale (Zdzinski & Barnes, 2002), and others to a single musical instrument. The reliability levels differ from one scale to another, but some of them demonstrate high standard qualities.

Experience in the artistic field has shown that the presentation format of music performance (audio versus audiovisual) is also important in performance evaluation. An experiment conducted on violinists (Gillespie, 1997) revealed that performers were rated lower on vibrato stability when assessed in the audio-only condition. The results were confirmed in a Canadian study (Wapnick et al., 2004) in which piano performances were rated higher when the audio performance was accompanied by the visual image of the soloist. Other studies (Wapnick et al., 1998, 2000) likewise underline adjudicators' tendency to overestimate music performances presented in the audiovisual condition.

3. Method

The aim of the study was to observe the score variability of music performance assessment by combining two conditions: whether students were evaluated globally or with a segmented scale, and whether recordings were evaluated in audio-only or audio-video format.

Fifty undergraduate music students were recorded in standard conditions and evaluated by four music experts (a flute player, a cellist, a composer and a conductor). The evaluators were all university professors. The performers were unknown to the evaluators. Performers were either string players (violin, viola, cello) or woodwind players (flute, oboe, clarinet and bassoon). They were asked to perform two instrumental fragments of their choice taken from performers' repertory: one aimed to reveal their technical abilities and another one aimed to uncover their musical expression. Subjects decided on performing classical compositions. The fragments were previously studied, so the students performed them from memory.

The recordings ranged from 1 to 5 minutes in length. In the end we acquired 100 recordings (two for each performer) in audio-video format. Later on we converted the audio-video performances into audio-only recordings.

The experts assessed the group of performers four times, with a one-day pause between evaluation sessions in order to reduce the learning effect. During the first two sessions, adjudicators assessed the audio presentations (once through global evaluation and once using a segmented scale), and during the last two sessions they evaluated the audiovisual presentations (also using global and segmented assessment).

In the case of segmented evaluation, we used a rating scale reflecting the factorial model developed by Brian Russell (Russell, 2010). The scale is designed to be used for strings, woodwind and voice, and measures music performance by assessing two dimensions: technique and expression. Technique is described by items such as tone, intonation, rhythmic accuracy and articulation, and expression by items such as tempo, dynamics, timbre and interpretation. The Romanian version of the scale showed a consistency of 0.93.
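The text does not name the consistency coefficient it reports. Assuming it is an internal-consistency index such as Cronbach's alpha, the minimal Python sketch below shows how a coefficient of this kind could be computed from item-level ratings. The function name `cronbach_alpha` and the sample ratings are hypothetical illustrations, not the study's data or procedure.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows, one row per rated performance.

    Each row holds the scores given on the k items of the scale
    (for example tone, intonation, rhythmic accuracy, articulation).
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("At least two items are needed.")
    # Sample variance of each item across all rated performances.
    item_variances = [
        variance([row[j] for row in item_scores]) for j in range(k)
    ]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: five performances scored on four items.
sample_ratings = [
    [5, 6, 5, 6],
    [3, 4, 4, 3],
    [6, 6, 7, 6],
    [4, 5, 4, 4],
    [2, 3, 3, 2],
]
print(round(cronbach_alpha(sample_ratings), 3))
```

Values close to 1 indicate that the items rise and fall together across performances. The sketch treats every rated performance as one row and ignores which expert gave the rating, which is a simplification.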

4. Results and discussion

A two-way ANOVA (test of within-subjects effects) was calculated for three sets of scores related to the dependent variable: one for the general level of music performance, one for the technical level of music performance and one for the music expression level. The purpose was to calculate the effect of two independent variables, measurement type (segmented versus global) and presentation format (audio versus audio-video), on music performance score variability.

In the case of the general level of music performance, there was no significant effect of measurement type [F(1, 49) = 1.56, p = 0.217], presentation format [F(1, 49) = 0.50, p = 0.483] or the interaction between measurement type and presentation format [F(1, 49) = 0.01, p = 0.908] on music performance score variability.

In the case of the technical level of music performance we found a significant effect of measurement type on music performance scores [F(1, 49) = 19.58, p < 0.001]. The presentation format had virtually no impact on score variability [F(1, 49) = 0.000, p = 1.000]. Although the interaction between measurement type and presentation format was not significant for the technical level of music performance [F(1, 49) = 2.66, p = 0.109], the scores were higher in the case of segmented evaluation, especially in the audio condition.

For the expression level of music performance, there was a significant effect of measurement type on music performance variability [F(1, 49) = 8.93, p = 0.004]. The presentation format showed no significant effect [F(1, 49) = 1.095, p = 0.301], and the interaction between the two independent variables had no significant effect [F(1, 49) = 1.692, p = 0.199].

When comparing the means related to the technical level of music performance, we can observe that segmented measurement (M = 5.38) yields higher scores than global measurement (M = 4.99). By contrast, the means related to the expression level of music performance showed higher scores for global measurement (M = 5.43) than for segmented measurement (M = 5.12).
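For readers who want to see how F ratios with (1, 49) degrees of freedom arise in a 2 x 2 within-subjects design, the following is a minimal Python sketch, not the analysis script used in the study. It relies on the fact that, when each factor has two levels, every effect has one degree of freedom and its F ratio equals the squared one-sample t statistic of a per-performer contrast. The function names, the tuple layout and the sample scores are hypothetical, and the sketch assumes one score per performer per condition (the text does not state how the four experts' ratings were combined).

```python
from math import sqrt
from statistics import mean, stdev


def contrast_f(contrast_scores):
    """Return F(1, n - 1) for a one-degree-of-freedom within-subjects effect.

    The F ratio is the squared one-sample t statistic of the
    per-performer contrast scores tested against zero.
    """
    n = len(contrast_scores)
    standard_error = stdev(contrast_scores) / sqrt(n)
    # Division fails if every performer has the same contrast score.
    t = mean(contrast_scores) / standard_error
    return t * t


def two_by_two_within_anova(scores):
    """scores: one tuple per performer, ordered as
    (global_audio, segmented_audio, global_av, segmented_av)."""
    # Main effect of measurement type: segmented minus global.
    measurement = [(sa + sv) - (ga + gv) for ga, sa, gv, sv in scores]
    # Main effect of presentation format: audio-video minus audio.
    presentation = [(gv + sv) - (ga + sa) for ga, sa, gv, sv in scores]
    # Interaction: does the segmented-global gap differ between formats?
    interaction = [(sv - gv) - (sa - ga) for ga, sa, gv, sv in scores]
    return {
        "measurement_type": contrast_f(measurement),
        "presentation_format": contrast_f(presentation),
        "interaction": contrast_f(interaction),
        "df": (1, len(scores) - 1),
    }


# Hypothetical scores for four performers; with 50 performers df is (1, 49).
example_scores = [(4, 5, 4, 5), (5, 5, 5, 6), (3, 5, 4, 4), (6, 6, 5, 7)]
result = two_by_two_within_anova(example_scores)
print(result)
```

The p values reported above would then be read from the F distribution with the returned degrees of freedom. That step is left out here because the Python standard library does not provide the F distribution, so a statistics package would be needed.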

One main finding of the present research is that the presentation format (audio only versus audio-visual) had no significant effect on score variability, regardless of the dimension of music performance (general, technical or expression). Moreover, in the case of the technical level, the results obtained in the audio-only condition and those obtained in the audio-video condition were practically identical. In other words, the music experts we used as evaluators were not influenced by the performers' image, and this may be due to their extensive experience in the musical field. All four evaluators are important figures in the musical domain; consequently, their professionalism may explain this finding. The results contrast with previous findings (Wapnick et al., 1998, 2000, 2004; Gillespie, 1997; Ryan & Costa-Giomi, 2004; Ryan et al., 2006), which showed significant differences in scoring the same performance in the audio-only and audio-visual conditions.

Another important result is that, when evaluated with the segmented scale, the music performers obtained higher scores for the technical level of music performance and lower ratings for music expression. In the case of the general level of music performance this difference is not significant. We explain this finding by the different expectations our four music experts have regarding the two dimensions of music performance (technique and expression). It seems that musical technique represents more for these experts than the items described in the scale (tone, intonation, rhythmic accuracy and articulation). By contrast, music expression may mean less for our evaluators than the items in the scale (timbre, dynamics, tempo and interpretation). As a result, evaluators gave lower ratings for technique and higher ratings for expression when they used their own personal assessment criteria of instrumental music performance.

A series of educational implications may be related to the present study's findings. During their development as music performers, from a very young age, students rely on frequent assessments done by their instrumental music teachers in order to create a coherent image of what a good performance is. At the same time, these assessments rarely include specific criteria generally accepted by all music experts, as music performance is often considered to have an important subjective nature. This study draws attention to the need for music evaluators to discuss and agree on these criteria, in order to help performers create a more organized strategy of instrumental training.

5. Conclusions

The present study found that instrumental music performance assessment may not necessarily be influenced by the performers' image, as our four music experts (a flutist, a cellist, a composer and a conductor) gave similar scores to audio-only and audio-visual recordings of the same musical material. Also, for the two main dimensions of music performance (technique and expression), the scores differed significantly between the global and segmented evaluations, due to the experts' higher expectations related to technique and lower expectations related to expression. This fact may be confusing for students who try to create a coherent picture of what music performance represents.

References

Bergee, M.J. (2003). Faculty Interjudge Reliability of Music Performance Evaluation. Journal of Research in Music Education, 51(2), 137-150;

Burnsed, V., & King, S. (1987). How Reliable Is Your Festival Rating? Update: Applications of Research in Music Education, 5(3), 12-13;

Burnsed, V., Hinkle, D., & King, S. (1985). Performance Evaluation Reliability at Selected Concert Festivals. Journal of Band Research, 21(1), 22-29;

Burrack, F. (2002). Enhanced Assessment in Instrumental Programs. Music Educators Journal, 88(6), 37-32;

Cantwell, R. H. & Jeanneret, N. (2004). Developing a Framework for the Assessment of Musical Learning: Resolving the Dilemma of the "Parts" and the "Whole". Research Studies in Music Education, 22, 2-13;

Ciorba, C. R., & Smith, N.Y. (2009). Measurement of Instrumental and Vocal Undergraduate Performance Juries Using a Multidimensional Assessment Rubric. Journal of Research in Music Education, 57(1), 5-15;

Fiske, H. E. (1975). Judge-Group Differences in the Rating of High School Trumpet Performances. Journal of Research in Music Education, 23(3), 186-189;

Fiske, H. E. (1977). The Relationship of Selected Factors in Trumpet Performance Adjudication. Journal of Research in Music Education, 25(4), 256-263;

Fiske, H.E. (1983). Judging Musical Performances: Method or Madness? Update, 7-10;

Forbes, G. (1994). Evaluative Music Festivals and Contests - Are They Fair? Update: Applications of Research in Music Education, 1, 16-20;

Gabrielsson, A. (2003). Music Performance Research at the Millennium. Psychology of Music, 31(3), 221-272;

Geringer, J. M., & Madsen, C. K. (1998). Musicians' Ratings of Good versus Bad Vocal and String Performances. Journal of Research in Music Education, 46(4), 522-534;

Geringer, J. M., Allen, M. L., MacLeod, R. B., & Scott, L. (2009). Using a Pre-screening Rubric for All-State Violin Selection: Influences of Performance and Teaching Experience. Update: Applications of Research in Music Education, 28(1), 41-46;

Gillespie, R. (1997). Ratings of Violin and Viola Vibrato Performance in Audio-Only and Audiovisual Presentations. Journal of Research in Music Education, 45(2), 212-220;

Griffiths, N. K. (2009). "Posh Music Should Equal Posh Dress": An Investigation into the Concert Dress and Physical Appearance of Female Soloists. Psychology of Music, 1-19;

Haroutounian, J. (2007). Perspectives of Musical Talent: A Study of Identification Criteria and Procedures. High Ability Studies, 11(2), 137-160;

Hewitt, M. P. (2007). Influence of Primary Performance Instrument and Education Level on Music Performance Evaluation. Journal of Research in Music Education, 55(1), 18-30;

Kinney, D. W. (2009). Internal Consistency of Performance Evaluations as a Function of Music Expertise and Excerpt Familiarity. Journal of Research in Music Education, 56(4), 322-337;

Ryan, C., & Costa-Giomi, E. (2004). Attractiveness Bias in the Evaluation of Young Pianists' Performances. Journal of Research in Music Education, 52(2), 141-154;

Ryan, C., Wapnick, J., Lacaille, N., & Darrow, A. A. (2006). The Effects of Various Physical Characteristics of High-Level Performers on Adjudicators' Performance Ratings. Psychology of Music, 34(4), 559-572;

Saunders, T. C., & Holahan, J. M. (1997). Criteria-Specific Rating Scales in the Evaluation of High School Instrumental Performance. Journal of Research in Music Education, 45, 259-272;

Stanley, M., Brooker, R., & Gilbert, R. (2002). Examiner Perceptions of Using Criteria in Music Performance Assessment. Research Studies in Music Education, 18(1), 46-56;

Thompson, S., & Williamon, A. (2003). Evaluating Evaluation: Musical Performance Assessment as a Research Tool. Music Perception, 21(1), 21-41;

Thompson, S., Williamon, A., & Valentine, E. (2007). Time-Dependent Characteristics of Performance Evaluation. Music Perception, 25(1), 13-29;

Wapnick, J., Mazza, J. K., & Darrow, A. A. (1998). Effects of Performer Attractiveness, Stage Behaviour, and Dress on Violin Performance Evaluation. Journal of Research in Music Education, 46(4), 510-521;

Wapnick, J., Mazza, J. K., & Darrow, A. A. (2000). Effects of Performer Attractiveness, Stage Behaviour, and Dress on Children's Piano Performances. Journal of Research in Music Education, 48(4), 323-336;

Wapnick, J., Ryan, C., Lacaille, N., & Darrow, A. A. (2004). Effects of Selected Variables on Musicians' Ratings of High-Level Piano Performances. International Journal of Music Education, 22(1), 7-20;

Wrigley, W. J. (2005). Improving Music Performance Assessment. Griffith University;

Zdzinski, S. F., & Barnes, G. V. (2002). Development and Validation of a String Performance Rating Scale. Journal of Research in Music Education, 50(3), 245-255;