This example shows how to use the IBM Watson Speech to Text service to recognize the speech in an audio file and produce a transcription of the spoken text in that file.
This example requires Speech to Text service credentials and Node.js.
Install the watson-developer-cloud module for Node.js:
$ npm install watson-developer-cloud
Copy the following code into a file (for example, app.js), and insert the username and password for your Speech to Text service instance.
var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');
var fs = require('fs');
var speech_to_text = new SpeechToTextV1({
  username: 'INSERT YOUR USERNAME FOR THE SERVICE HERE',
  password: 'INSERT YOUR PASSWORD FOR THE SERVICE HERE',
  url: 'https://stream.watsonplatform.net/speech-to-text/api'
});
// Specify the format (content type) of the audio file to be transcribed.
var params = {
  content_type: 'audio/flac'
};
// Create the stream,
var recognizeStream = speech_to_text.createRecognizeStream(params);
// pipe in some audio,
fs.createReadStream('0001.flac').pipe(recognizeStream);
// and pipe out the transcription.
recognizeStream.pipe(fs.createWriteStream('transcription.txt'));
// To get strings instead of Buffers from received `data` events:
recognizeStream.setEncoding('utf8');
// Listen for 'data' events for just the final text.
// Listen for 'results' events to get the raw JSON with interim results, timings, etc.
['data', 'results', 'error', 'connection-close'].forEach(function(eventName) {
  recognizeStream.on(eventName, console.log.bind(console, eventName + ' event: '));
});
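If you want more than the final text, the 'results' events deliver the service's JSON response. The handler below is a minimal sketch: the payload shape (a results array whose entries carry alternatives with a transcript and a final flag) is assumed from the standard Speech to Text JSON output rather than taken from this example, and whether interim hypotheses arrive depends on the recognition parameters you set.
// Sketch: extract the transcript from each 'results' event.
// The exact payload shape is an assumption; log the raw event to confirm it.
recognizeStream.on('results', function(data) {
  var result = data.results && data.results[0];
  if (result && result.alternatives && result.alternatives[0]) {
    console.log((result.final ? 'final' : 'interim') + ': ' + result.alternatives[0].transcript);
  }
});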
Save the sample audio file 0001.flac to the same directory. This example code is set up to process FLAC files, but you can modify the params section of the sample code to obtain transcriptions from audio files in other formats. Supported formats include WAV (type audio/wav), OGG (type audio/ogg), and others. See the Speech to Text API reference for a complete list.
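For example, to transcribe a WAV file, only the content type and the input file name change. This is a sketch; 0002.wav is a placeholder name for your own audio file.
// Sketch: transcribe a WAV file instead of FLAC.
var wavParams = {
  content_type: 'audio/wav'
};
var wavStream = speech_to_text.createRecognizeStream(wavParams);
fs.createReadStream('0002.wav').pipe(wavStream); // 0002.wav is a placeholder file name
wavStream.pipe(fs.createWriteStream('transcription.txt'));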
Run the application (use the name of the file that contains the example code):
$ node app.js
After running the application, you will find the transcribed text from your audio file in transcription.txt in the directory from which you ran the application.