Voice recognition: performance text

‘Voice Recognition: A Play (after Gertrude Stein, William Shakespeare and Rolf Harris)’


The body and the surplus are there and the surplus is there and the body is there

and where oh where is their identity, is the identity there anywhere.

The speaker and the dog are there and the dog is there and the speaker is there

and where oh where is their identity, is the identity there anywhere.

there  there  their  their  dere  dere  there  deir dere dey are

Scene I: code of capture

I am I because my voice gives me away. The figure wanders on alone.

The little voice does not appear

because if it did then there would be nothing to fear.

Positioned neatly before the microphone / hair brushed and parted /

he is poised to produce “dulcet tones” for transatlantic dispatch.

Two little boys had two little toys

A shibboleth, right there in the first line, to trip you up.

boy   toy   boys   toys   bie   tie

The machine is implacable, in chrome and plastic

this alien artefact with an attendant minion.

The acolyte in black recedes into the shadows of the family group

foregrounds the glinting and whirring box.

The little surplus is not alone because no little surplus could be alone.

If it were alone it would not be there.

So then the origin has to be like this.

Each had a wooden horse  /  Gayly they played each summer’s day

Will the machine prove wilier than any scheme of masking or feinting,

revealing in “playback” poor efforts at “passing”,

highlighting hybrid noises made between hiding and showing,

between not saying and being heard.

Voices smell like voices / Songs signify like songs / Men sound like men.

Warriors both of course

ORIGIN 2: What men do

Policed by a macho code that is not always clear

hyper correctness or speaking too well – too properly – is outlawed

One little chap then had a mishap  /  Broke off his horse’s head

in parallel with prohibitions against tale-telling – crying – sulking.

Wept for his toy then cried with joy

In speech marked by faults correct and wrong / bad and good /

allowed and condemned

displays a mixture of verbal errors, of failed righting

sissy  lisping  poofy  pitching  bent

expression waltzes a wavy path between straight and queer.

ferminirne effermirnarste poncyence ponstcinerss  // 

hard won (woarn)  s(t)olidity marlenerss

He works at not being noticed for being too right –

not staying enough the right side of wrong to pass.

Try an origin again. Another code.

There is any difference between pouring and spilling.

As his young playmate said:

A CODE: popular song

There is no in between in a code.

An origin could just as well only mean two.

Clustering algorithms are used on training data [data]

to find a set of properties that best distinguish the speech data [data].

The vocal excess. The vocal excess too.

Vocal signature. Vocal signature does not have it to do.

What can a surplus do and with spilling too.

“Did you think I would leave you crying  /  When there’s room on my horse for two

I had seen grown-ups crying listening to singers,

singers there in the room with them and singers on records

Nobody knows what the proper speech is when they are drunk.

Everybody who has a Presley has had a Sinatra and that Sinatra has had a McCormack.

This actually is true of a Carpenter who was a Ferrier and Crosby had a Gigli.

Climb up here Jack and don’t be crying  /  I can go just as fast with two

Singing brings the voice as object to the fore.

What is the use of being a little boy if you are to grow up to be a singer.

When we grow up we’ll both be soldiers

ANOTHER SONG: scene of capture 2

A singer coming. Yes, there is a great deal of use in a singer coming /

but will he come at all / if he does come will he come here.

It is a new factor in the hazard-filled verbal landscape,

this dull mechanical thing without soul or understanding.

trace // trace // trace

An impersonal mode of copying that is truer and can strip off layers of disguise –

peel away efforts made to pass — to operate unmarked in the speech terrain.

This unthinking tool of transcription

takes the taped traces wound up inside it

to be audited by far-flung relations nodding and smiling

“he’s one of them” “you can hear it”

“even if he doesn’t know it himself” “it’s there in the voice”

Well anyway he does sing and if he likes it he will sing again.

And our horses will not be toys  /  And I wonder if we’ll remember  /  When we were two little boys”

ACT 1 SCENE: hearing others

Now this is the way I had coded that code. But not at all not as one is one.

What are the indicators of alterity the sound tracks of otherness?

How might they be listed / lisped / lisp’ed / lips’sted …?

Long years had passed, war came so fast

He gathers into an assembly other voices in the culture that are marked /

that are marked by sniggers or laughter they provoke or by a hush they prompt

Bravely they marched away

Other electronic voices / mediated speeches / they stand out from the ranks –

draw attention to themselves –

they other their speakers against a ‘normal’ vocal sonic ground

[base tone – aaaaaaaawwwwwwhhhhh ]

CODE II: ooo00OOHHhhhh!

There were these other voices in circulation that he heard as a route not to follow,

as a model to avoid, to reject as a pattern for his audible expression.

Becoming attuned to the expectations and responses of the auditors of those other voices

he could attune to them too and avoid following them into tell-tale telling.

[Chorus] It has nothing to do / vocal signature has nothing to do with anything.

[Chorus] No not with a surplus

[Tears] No not with a surplus.

[Chorus] I am I because my little surplus knows

[Chorus] Yes, there I told you, vocal signature is not at all outpouring.

Cannon roared loud, and in the mad crowd  /  Wounded and dying lay

SCENE 1: at a soccer match

A dog trips over a word because it is a word that tripped any one.

The expert scrutinizes for speech habits, psycholinguistic features,

dialect, inflection, syllable grouping, and breath patterns. / uhuhhh/

Like profilers, they are trying to put themselves in the mind of the suspect.

He has forgotten that he has been tripped by a word

no not forgotten because this one the same one is not the one that can trip any one.

Up goes a shout, a horse dashes out

An OI! is inserted into this line in the song by Hartlepool United fans,

Poolies as they name themselves

Up goes a shout, OI!

It makes present the shout that goes up. This ‘oi’ is a tricky diphthong in English as she is spoken in Ireland, popping up in ‘boy’ ‘toy’ ‘voice’ ‘joy’, making this song an obstacle course for the singer.

Out from the ranks so blue

‘So blue’ is repeated by the Poolies, blue being the Hartlepool team colour

and so worthy of emphasis.

… so blue / so blue

Speech excessive in vocabulary or in propriety or in pitch or in performance is suspect,

and these causes of suspicion are pounced upon and drawn out for examination,

for exaggerated repetition, for mocking re-sounding

/ NA / NA / NA / NA / NA

What team do you play for?  Which foot do you kick with?  Whose side are you on?

A hazard with the shibboleth is you can get yourself caught in a double bluff,

switching to what you hope is the correct response and in the switch betraying yourself

and revealing your betrayal and your ‘self’.

Gallops away to where Joe lay  /  Then came a voice he knew:

A PLAY: The question of identity

I am I because my little dog knows my voice.

My ears have not yet drunk a hundred words of that tongue’s utterance, yet I know the sound.

An emblem of the fidelity of the recording process [like painted grapes]

the dog is held by the copied voice,

and the faithful dog sits listening to the faithful copy.

He recognises ‘his dead master’s voice’

[old Argos hears and knows Odysseus’ voice and then dies].

The HMV logo reproduces a scene of listening,

it offers us an entertainment of Nipper the terrier

with his head cocked in an attitude of attention,

an imitation of human listening.

“Did you think I would leave you dying

The dog listens like us / the dog remembers (like us) /

the dog misunderstands the mechanism of mimicry.

it is nor hand, nor foot, Nor arm, nor face, nor any other part Belonging to a man

At best, voiceprint identification is like polygraphy (lie detection)

and […] makes for a valuable investigative tool to screen potential suspects.

When there’s room on my horse for two  /  Climb up here Joe, we’ll soon be flying

ANOTHER PLAY: another ploy

The shift or slip of a vowel on or across invisible boundaries.

The tape-recording would drag the secret mouth play into the open.

Alterity becomes audible in isolated — amplified — augmented — shame-inducing focus.

I can go just as fast with two

A voice originating within the body suggests a ‘true’ connection to the ‘being’ of the speaker,

pulling it out and putting it on show strips that ‘I’ naked;

it turns it inside out; and makes a show of it,

of a glamorous costume, a flamboyant get-up, a gorgeous outfit.

Did you say Joe I’m all a-tremble  /  Perhaps it’s the battle’s noise

ACT 1 SCENE IV: the uncanny

The voice I hear returned to me is me but it is not ‘I’

with a shiver I recognise and deny recognition

this is not the voice in my head, the voice I project

in that shudder I know that this is me and that the little machine knows me.

But I think it’s that I remember  /  When we were two little boys

ACT 1 SCENE III: working with statistics

And so surpluses and vocal signature have no identity.

Some words will correspond to codebook symbol sequences, it makes sense to use an HMM

[a hidden Markov model] for each word. These word HMMs would need to be combined with

the language model to produce an HMM model for sentences.

I like an origin of acting so and so / and a surplus / my surplus is any one of not one.

But we in Ireland are not displaced by a surplus / oh no / no not at all /

not at all at all displaced by a surplus.

Do you think I would leave you dying  /  There’s room on my horse for two

hidden Markov model / hmmmm

Using this statistical modelling in the development of electronic speech recognition

bases the recognition on an expectation of what will next be said,

be sounded, be spoken, on probabilities of emission.

If productions of queer or other voices be unexpected / improbable / will these emissions be missed

and remain unmapped and electronic speech become a reproduction toward a norm.

Climb up here Joe, we’ll soon be flying  /  Back to the ranks so blue

ACT 1 SCENE 1: bodily emissions

But later well later the voice is older.

And so the voice roams around / he knows the one he knows /

but does that make a difference. A song is exactly like that.

In cutting the voice off from the body copying inserts a gap.

No longer enclosed by, close to, the singer’s body the trace is heard as other than /

other to / other from him.

Do its characteristics float free of their origin

becoming untied traits loosely dispersing in disembodied (freedom)?

But a heard voice is heard as from and of a body,

and the recorded voice prompts the hearer to assign it a source,

a body that is the source of the excessive vocal trace.

Can you feel Joe I’m all a tremble

excess of emotion / excess of expression / excess of restraint / excess of presence /

excess of meaning / excess of noise / excess of information

too much it is too much

it spills its speaker its sender its ‘I’ its ‘me’ over limits into undecided and improper spaces

[Chorus: But there is no remembering the faithful copy.]

[Chorus:  There is no memory in the proper speech.]

He manages utterances to communicate information.

Adopting another layer of control, he notes the codes in operation

and switches among those codes.

If I am I then my voice gives me away.

Perhaps it’s the battle’s noise  /  But I think it’s that I remember  /  When we were two little boys”

SCENE II: an ending

It might desire something but it does not sing again.

And so to make excitement and not nervousness into a song.

And then to make a code with just faithful copy. (Let us try.)

When we were two little boys”

To make a song with proper nature and not anything of the proper speech.

To make a code with faithful version and not anything of the faithful copy.

And does a little voice making a noise make the same noise.

When we were two little boys”

O, for a falconer’s voice, To lure this tassel-gentle back again! Bondage is hoarse, and may not speak aloud; Else would I tear the cave where Echo lies, And make her airy tongue more hoarse than mine, With repetition of my Romeo’s name.

(the,   to,       and,   me,     on,      is,        you,    I,          it,        a)

the ten words on which identification can best be based   (the, to, and, me, on, is, you, I, it, a)

When we were two little boys”

A sound, once made (even by the same individual) can never be exactly replicated in all its characteristics.

The voice answers without asking because the voice is the answer to anything that is that voice.

When we were two little boys”

An end of a song is not the end of a day.

Mark Leahy

note on the text:

In the winter of 1969-70 ‘Two Little Boys’ was a number one hit in the UK and Ireland. Some time after this my cousin and I were pressed to sing the song together, to be recorded by a visiting American relation. The scene has remained with me. It was my first encounter with my recorded voice, and is wrapped up with questions of identity, difference, and becoming. It also raises questions of what may be found in the reproduction of spoken or sung text. What is this unique trace? Who do we hear here?

The performance text uses Gertrude Stein’s ‘Identity: A Poem’ as a base onto which is layered material around the relation of voice to self, of speech to body, and of the spoken or heard constructions of gender, sexuality or belonging. The song ‘Two Little Boys’ is expanded and commented on, and the text draws from the balcony scene in Shakespeare’s Romeo and Juliet, as well as from research into voice recognition technology, and local linguistic traces.