
Seeing What is Said: New Methods for Extracting Information from Speech


 
 

Perceptions of Charisma from Spoken Language in Standard American English and Palestinian Arabic 

Julia Hirschberg

Columbia University

Joint work with Andrew Rosenberg and Fadi Biadsy -- and thanks to Wisam Dakka

SRI, 26 July 2007


 
 

What is Charisma?  

  • The ability to attract and retain followers by virtue of personal characteristics -- not traditional or political office (Weber ’47)
    • E.g. Gandhi, Hitler, Castro, Martin Luther King Jr., ...
    • Personalismo
  • What makes an individual charismatic? (Bird ’93, Boss ’76, Dowis ’00, Marcus ’67, Touati ’93, Tuppen ’74, Weber ’47)
    • Their message?
    • Their personality?
    • Their speaking style?
 
 

What is Charismatic Speech? 

  • Circularly…
    • Speech that leads listeners to perceive the speaker as charismatic
  • What aspects of speech might contribute to the perception of a speaker as charismatic?
    • Content of the message?
    • Lexico-syntactic features?
    • Acoustic-prosodic features?
 
 

Is Charisma a Culture-Dependent Phenomenon? 

  • Do people of different languages and cultures perceive charisma differently?
  • Do they perceive charismatic speech differently?
    • Do Arabic listeners respond to American politicians the same way Americans do?
    • Do Americans hear Swedish professors the same way Swedish students do?

 


 
 

Why Study Charismatic Speech? 

  • It’s an interesting phenomenon
  • To identify potential charismatic leaders
  • To provide a feedback system for individuals who want to improve their speaking style -- politicians, professors, students…
  • To create a charismatic Text-to-Speech system
 
 

Our Approach 

  • Collect tokens of charismatic and non-charismatic speech from a small set of speakers on a small set of topics
  • Ask listeners to rate the statement ‘The speaker is charismatic’ plus statements about a number of other attributes (e.g. The speaker is … boring, charming, persuasive, …)
  • Correlate listener ratings with lexico-syntactic and acoustic-prosodic features of the tokens to identify potential cues to perception of charisma
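
The correlation step can be sketched as follows. This is a minimal illustration, not the study's analysis: the slides do not say which correlation statistic or significance test was used, so Pearson's r is an assumption here, and the ratings and word counts are made-up values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: mean charisma rating per token and token length in words.
mean_ratings = [2.5, 3.0, 3.5, 4.0, 4.5]
word_counts = [12, 20, 25, 41, 38]
r = pearson_r(mean_ratings, word_counts)
print(f"r = {r:.3f}")
```

A positive r on real data would correspond to the kind of finding reported later, e.g. that longer tokens tend to receive higher charisma ratings.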
 
 

American English Perception Study 

  • Data: 45 speech segments of 2-30 s each, 5 from each of 9 candidates for the Democratic nomination for U.S. president in 2004
    • 2 ‘charismatic’, 2 ‘not charismatic’
    • Topics: greeting, reasons for running, tax cuts, postwar Iraq, healthcare
    • 4 genres: stump speeches, debates, interviews, ads
  • 8 subjects rated each segment on a Likert scale (1-5) for 26 questions in a web survey
  • Duration: avg. 1.5 hrs, min 45m, max ~3hrs
 
 

Results:  How Much Do Subjects Agree with Each Other? 

  • Over all statements?
    • Using weighted kappa statistic with quadratic weighting, mean κ = 0.207
  • On the charismatic statement?
    • κ = 0.232 (8th most agreed upon statement)
  • By token?
    • No significant differences across all tokens
  • By statement?
    • Individual statements demonstrate significantly different agreements (most agreement: The speaker is accusatory; least agreement: The speaker is trustworthy)
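
For reference, a quadratically weighted kappa for two raters on a 1-5 Likert scale can be computed as below. This is a sketch of the standard Cohen's weighted kappa. How the study aggregated over its 8 subjects is not stated, so averaging over all rater pairs in `mean_pairwise_kappa` is an assumption, and both function names are illustrative.

```python
from itertools import combinations


def quadratic_weighted_kappa(ratings_a, ratings_b, min_rating=1, max_rating=5):
    """Cohen's kappa with quadratic disagreement weights for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    k = max_rating - min_rating + 1
    n = len(ratings_a)
    # Confusion matrix: observed[i][j] counts items rated i by A and j by B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a - min_rating][b - min_rating] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # 0 on the diagonal, 1 at the extremes
            expected = row_totals[i] * col_totals[j] / n  # chance agreement count
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - weighted_observed / weighted_expected


def mean_pairwise_kappa(ratings_by_rater):
    """Average the weighted kappa over all pairs of raters (assumed aggregation)."""
    pairs = list(combinations(ratings_by_rater, 2))
    return sum(quadratic_weighted_kappa(a, b) for a, b in pairs) / len(pairs)
```

Quadratic weighting penalizes a 1-vs-5 disagreement far more than a 3-vs-4 one, which suits ordinal Likert ratings.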
 
 

Results:  What Do Subjects Mean by Charismatic? 

  • Which other statements are most closely correlated with the charismatic statement (as determined by kappa)? This gives a functional definition:
    • The speaker is enthusiastic: 0.620
    • The speaker is persuasive: 0.577
    • The speaker is charming: 0.575
    • The speaker is passionate: 0.543
    • The speaker is convincing: 0.499
    • The speaker is boring: -0.513


 
 

Results:  Does Whether a Subject Agrees with the Speaker or Finds the Speaker ‘Clear’ Affect Charisma Judgments?

  • Whether a subject agrees with a token does not correlate highly with charisma judgments (0.30)
  • Whether a subject finds the token clear does not correlate highly with charisma judgments (0.26)
 
 

Results:  Does the Identity of the Speaker Affect Judgments of Charisma? 

  • There is a significant difference between speakers (p=2.20e-2)
  • Most charismatic
    • Sen. John Edwards (mean 3.86)
    • Rev. Al Sharpton (3.56)
    • Gov. Howard Dean (3.40)
  • Least charismatic
    • Sen. Joseph Lieberman (2.42)
    • Rep. Dennis Kucinich (2.65)
    • Rep. Richard Gephardt (2.93)
 
 

Results:  Does Recognizing a Speaker Affect Judgments of Charisma? 

  • Subjects were asked at the end of the study to identify which speakers, if any, they recognized
  • Mean number of speakers believed to have been recognized: 5.8
  • Subjects rated ‘recognized’ speakers as significantly more charismatic than those they did not (mean 3.39 vs. mean 3.30).
 
 

Results:  Does Genre or Topic Affect Judgments of Charisma? 

  • Recall that tokens were taken from debates, interviews, stump speeches, and campaign ads
    • Genre does influence charisma ratings (p=.00035)
    • Stump speeches were the most charismatic (3.38)
    • Interviews were the least (2.96)
  • Topic did not affect ratings of charisma significantly (p=.059), although mean ratings ordered as:
    • Healthcare > post-war Iraq > reasons for running  neutral > taxes
 
 

What Makes Speech Charismatic? Features Examined
 

  • Duration (secs, words, syls)
  • Charismatic speech is personal:  Pronoun density
  • Charismatic speech is contentful:  Function/content word ratio
  • Charismatic speech is simple:  Complexity: mean syllables/word (Dowis)
  • Disfluencies
  • Repeated words
 
  • Min, max, mean, stdev F0 (Boss, Tuppen)
    • Raw and normalized by speaker
  • Min, max, mean, stdev intensity
  • Speaking rate (syls/sec)
  • Intonational features: 
    • Pitch accents
    • Phrasal tones
    • Contours
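
A few of the features above can be sketched in code. This is an illustration under stated assumptions, not the study's feature extractor: the pronoun set, the definition of a "repeated word" (any word form occurring more than once in the token), and z-scoring as the per-speaker normalization are all assumptions, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Illustrative pronoun set (an assumption, not the study's inventory).
FIRST_PERSON_PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_density(words, pronouns):
    """Fraction of the token's words that belong to the given pronoun set."""
    if not words:
        return 0.0
    return sum(1 for w in words if w.lower() in pronouns) / len(words)


def repeated_word_proportion(words):
    """Fraction of word tokens whose lowercased form occurs more than once."""
    if not words:
        return 0.0
    counts = Counter(w.lower() for w in words)
    return sum(c for c in counts.values() if c > 1) / len(words)


def speaking_rate(n_syllables, duration_secs):
    """Speaking rate in syllables per second."""
    if duration_secs <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_secs


def normalize_by_speaker(values):
    """z-score a speaker's values (e.g. mean F0 per token) against that speaker."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


words = "we will win because we believe in our country".split()
print(pronoun_density(words, FIRST_PERSON_PLURAL))  # 3 of 9 words
print(repeated_word_proportion(words))              # "we" occurs twice: 2 of 9
```

Normalizing by speaker removes differences in each speaker's habitual pitch range, so that a raised F0 is measured relative to that speaker's own baseline.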
 
 

Results:  Lexico-Syntactic Correlates of Charisma 

  • Length: Greater number of words positively correlates with charisma (p=0)
  • Personal pronouns:
    • Density of first person plural and third person singular pronouns positively correlates with charisma (p=0, p=0)
    • Third person plural pronoun density correlates negatively with charisma (p=1.47e-5)
  • Content: Higher ratio of function/content words positively correlates with charisma (p=.035)
  • Complexity: Higher mean syllables/word positively correlates with charisma
  • Disfluency: greater % negatively correlates with charisma (p=6.02e-5)
  • Repetition: Proportion of repeated words positively correlates with charisma (p=.001)

 


 
 

Results:  Acoustic-Prosodic Correlates of Charisma 

  • Pitch:
    • Higher F0 (mean, min over male speakers) positively correlates with charisma (p=1.98e-6, p=0)
    • Normalized mean F0 positively correlates with charisma and normalized max approaches significant correlation (p=.013, p=.064)
  • Loudness: Greater mean intensity tends to positively correlate with charisma (p=.053)
  • Speaking Rate:
    • Faster overall rate positively correlates with charisma (p=.000)
    • Faster rate within the fastest intonational phrase does too (p=.004)
  • Duration: Longer duration correlates positively with charisma (p=.015)
  • Pause length: standard deviation negatively correlates with charisma
 
 

Results:  Prosodic Correlates of Charisma (Hand-Annotated Features) 

  • Pitch Accent Type:
    • Positive correlation with H* accents (p=.004)
    • Negative correlation with L*+H accents (p=.003) and with L* accents (p=1.15e-5)
  • Phrasal Types
    • Negative correlation with rising phrase boundaries (p=.001)
    • Negative correlation with downstepped contour presence in token (p=.0025)
 
 

Summary 

  • In Standard American English, charismatic speakers tend to be those also highly rated for enthusiasm, charm, persuasiveness, passionateness and convincingness – they are not thought to be boring
  • Charismatic utterances tend to be longer than others, to contain a smaller proportion of content to function words, a higher density of first person plural and third person singular pronouns and fewer third person plurals, fewer disfluencies, a larger percentage of repeated words, and more complex words than non-charismatic utterances
  • Charismatic utterances are higher in pitch (mean, min), more regular in pause length, greater in intensity, and faster; they have more H* accents and fewer L* and L*+H accents, fewer rising contours, and fewer downstepped contours
 
 

Replication of Perception Study from Text Alone 

  • Lower statement agreement, much less on charismatic statement, different speakers most/least charismatic
  • ‘Agreement with speaker’, genre and topic had stronger correlations
  • Lexico-syntactic features show weaker correlations
    • 1st person pronoun density negatively correlated and complexity not at all
    • Similar to speech experiment for duration, function/content, disfluencies, repeated words
 
 

Charisma Across Cultures 

  • Is the same true for charismatic utterances in other languages and cultures?
  • If a Palestinian Arabic speaker judges charisma from Arabic utterances, will we find similar or different correlates of charismatic speech?
  • If an American listens to Palestinian speech, will their judgments be similar to Palestinians?  If a Palestinian listens to our American tokens, will their judgments be similar to our American listeners?
 
 

Charismatic Speech in Palestinian Arabic  

  • Are these tokens charismatic?
  • Are these?
 
 

Palestinian Arabic Perception Study 

  • Same paradigm as for SAE
  • Materials: 
    • 44 speech tokens from 22 male native-Palestinian Arabic speakers taken from Al-Jazeera TV talk shows
    • Two speech segments extracted for each speaker from the same topic (one thought charismatic and one not)
  • Web form with statements to be rated translated into Arabic
  • Subjects: 12 native speakers of  Palestinian Arabic
 
 

Data  

  • Total corpus duration: 10.3 minutes
  • Average token duration: 14 seconds
  • Token with min duration: 3 seconds
  • Token with max duration: 28 seconds
  • Total number of words: 1322 words
  • Average number of words in token: 30 words
  • Token with min words: 9 words
  • Token with max words: 65 words


 
 

How Does Charisma Differ in Arabic? 

  • Subjects agree on judgments a bit more (κ=.225) than for English (κ=.207) but still low
    • Agree most on clarity of message, enthusiasm, charisma, intensity, anger of speaker
    • Agree least on spontaneity, ordinariness, friendliness, desperation, passionateness of speaker
    • Charisma statement correlates (positively) most strongly with speaker toughness, powerfulness, persuasiveness, enthusiasm, charm, and negatively with boringness
  • Role of speaker identity is important in judgments of charisma in Arabic, as in English
    • Most charismatic speakers: Ibrahim Hamami (4.75), Azmi Bishara (4.42), Mustafa Barghouti (4.33)
    • Least: Shafiq Al-Hoot (3.10), Azzam Al-Ahmad (3.33), Mohammed Al-Tamini (3.42)
    • Raters claimed to recognize only 0.55 speakers on average, perhaps because the speakers were less well known than the Americans
  • Topic is important in judgments of charisma (p=.043)
    • Israeli separation wall > assassination of Hamas leader > debates among Palestinian groups > the Palestinian Authority and calls for reform > the Intifada and resistance
 
 

Lexical Cues to Charisma 

  • Length in words positively correlates with charisma, as in English
  • Disfluency rate negatively correlates, as in English
  • Repeated words positively correlates with charisma, as in English
  • Presence of Arabic ‘dialect’ (words, pronunciations) negatively correlates with charisma
  • Density of third person plural pronouns positively correlates with charisma, differing from English

 


 
 

Acoustic/Prosodic Cues to Charisma 

  • Duration was positively correlated with charisma, as in English
  • Speaking rate approached a significant negative correlation, the opposite of English
    • But rate of the fastest intonational phrase in the token positively correlated in both languages
    • Standard deviation of rate across intonational phrases positively correlated with charisma in Arabic
  • Pauses
    • Pause-to-word ratio positively correlated with charisma, unlike in English
    • Standard deviation of pause length positively correlated in Arabic but negatively in English
  • Pitch: 
    • Mean pitch positively correlates (as in English) but also F0 max and sdev
    • Min pitch negatively correlates (opposite from English)
  • Intensity: Standard deviation positively correlates with charisma

 


 
 

How Are Perceptions of Charisma Similar Across Cultures? 

  • Level of subject agreement on statements
  • Role of speaker ID in charisma judgments
  • Positive correlations with charisma
    • Duration, repeated words
    • Speaking rate of fastest IP
  • Negative correlations with charisma
    • Disfluencies
 
 

How Do Charisma Judgments Differ Across Cultures? 

  • Statements most and least agreed upon
  • Role of topic in charisma judgments
  • Positive correlations with charisma
    • Sdev of speaking rate, pause/word ratio, sdev of pause length, F0 max and sdev, sdev intensity
  • Negative correlations with charisma
    • Dialect, density of third person plural pronouns
    • Speaking rate, min F0

 


 
 

Future Work 

  • Complete machine learning experiments on automatic detection of charisma
  • Complete perception experiments of Arabic with American listeners and American English with Palestinian listeners
  • Swedish, Japanese next….
  • And….
    • Perception studies for resynthesized American and Arabic tokens
    • An automatic charisma scorer for American English and Palestinian Arabic
 
 

Thank you!


 
 

Arabic Prosodic Phenomena: MSA vs. Dialect
 

  • A word is considered dialectal if:
    • It does not exist in the standard Arabic lexicon
    • It does not satisfy the MSA morphotactic constraints
    • It is phonetically different from the MSA form (e.g., ya?kul vs. ywkil)
  • In corpus of tokens
    • 8% of the words are dialect.
    • 80% of the dialect words are accented.

 


 
 

Arabic Prosody: Accentuation 

  • 70% of words are accented
  • 60% of the deaccented words are function words or disfluent items
    • Based on automatic POS analysis (MADA)  
    • 12% of  content words are deaccented
  • Distribution of accent types:
    • H* or !H* pitch accent, 73%
    • L+H* or L+!H*, 20%
    • L*, 5%
    • H+!H*, 2%
 
 

Arabic Prosody: Phrasing 

  • Mean of 1.6 intermediate phrases per intonational phrase
  • Intermediate phrases contain 2.4 words on average
  • Distribution of phrase accent/boundary tone combinations
    • L-L% 59%
    • H-L% 26%
    • L-H% 8%
    • H-L% 6%
    • H-H% 1%

 


 
 

Arabic Prosody – most common contours 

  • H* L-: 21.9
  • H* H-: 13.4
  • L+H* L-: 9.7
  • H* H* L-: 7.6
  • H* !H* L-: 4.1
  • L* L-: 4.1
  • L+H* !H* L-: (value missing)
  • H* H* H-: (value missing)
  • H* !H* !H* L-: 2.3
  • L+H* H-: 2.1


 
 

Arabic Prosody – Disfluency 
 

  • In addition to standard disfluency:
    • Hesitations
    • Filled pauses
    • Self-repairs
  • In Arabic, speakers can produce a sequence of all of the above (see Praat files 1036 and 2016)
  • Disfluency may disconnect prepositions and conjunctions from the content word: 
    • ولتأتي => و ... لـ ... يعني ... تأتي 
    • w- l- uh- yEny uh- t?ty  instead of wlt?ty
