Text-to-Speech web app with Vue, OpenAI and Go
Before we started working with React at e.GO Mobile, we used Vue.
This time I will show you how to quickly build a nice-looking Single Page Application with Vuetify, which converts text to audio (in other words: reads the text aloud), using OpenAI’s text-to-speech API.
As usual, there is an example project on GitHub for trying out:
The backend is written in Go with support for automatic restart on code changes.
Requirements
You need an installation of Docker with docker-compose to run both the frontend and the backend at the same time via docker-compose.yml.
Everything else is handled inside the containers.
Backend
The Go backend uses FastHTTP and the compatible fasthttprouter to handle the requests, with optional CORS support.
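To give an idea of how these pieces fit together, here is a minimal sketch of such a setup; the CORS handling shown here is a simplified assumption and not necessarily the exact code from the example project:

package main

import (
    "log"

    "github.com/buaazp/fasthttprouter"
    "github.com/valyala/fasthttp"
)

func main() {
    router := fasthttprouter.New()

    // routes are registered here, e.g. router.POST("/", ...)
    // as shown in the handler below

    // optional CORS: wrap the router's handler and add the
    // required headers to every response
    withCORS := func(next fasthttp.RequestHandler) fasthttp.RequestHandler {
        return func(ctx *fasthttp.RequestCtx) {
            ctx.Response.Header.Set("Access-Control-Allow-Origin", "*")
            ctx.Response.Header.Set("Access-Control-Allow-Headers", "Content-Type")
            next(ctx)
        }
    }

    log.Fatal(fasthttp.ListenAndServe(":4000", withCORS(router.Handler)))
}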
Automatic restart of the backend on code changes is not supported out of the box. For this, I had to set up the backend service in docker-compose.yml with Air by Rick Yu:
backend:
  image: cosmtrek/air
  working_dir: /go/src/app
  env_file:
    - ./.env
    - ./backend/.env
    - ./backend/.env.local
  environment:
    - GO111MODULE=on
  ports:
    - 4000:4000
  volumes:
    - ./backend:/go/src/app
Before everything will work, you have to update the .env.local file in the backend subfolder with a valid OpenAI API key by adding an OPENAI_API_KEY entry.
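The entry is a single line; the value below is just a placeholder for your own key:

OPENAI_API_KEY=<your-openai-api-key>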
The REST service provides a POST endpoint at http://localhost:4000, which expects the text and the voice in the request body:
// ...

router.POST("/", func(ctx *fasthttp.RequestCtx) {
    sendError := func(err error) {
        ctx.SetStatusCode(500)
        ctx.WriteString(err.Error())
    }

    // read data from request
    postBodyData := ctx.PostBody()

    var postBody TextToSpeechRequestBody
    err := json.Unmarshal(postBodyData, &postBody)
    if err != nil {
        sendError(err) // parsing input as JSON failed
        return
    }

    // setup request for OpenAI
    requestBody := map[string]interface{}{
        "model":           "tts-1",
        "input":           postBody.Text,
        "response_format": "opus",
        "voice":           postBody.Voice,
    }

    requestBodyData, err := json.Marshal(requestBody)
    if err != nil {
        sendError(err) // could not create JSON string from `requestBody`
        return
    }

    // setup the request
    request, err := http.NewRequest("POST", "https://api.openai.com/v1/audio/speech", bytes.NewBuffer(requestBodyData))
    if err != nil {
        sendError(err) // could not create the request
        return
    }

    // setup headers like API key
    request.Header.Set("Content-Type", "application/json")
    request.Header.Set("Authorization", fmt.Sprintf("Bearer %s", OPENAI_API_KEY))

    client := &http.Client{}

    response, err := client.Do(request)
    if err != nil {
        sendError(err) // could not execute the request
        return
    }
    defer response.Body.Close()

    responseBodyData, err := io.ReadAll(response.Body)
    if err != nil {
        sendError(err) // could not read the response
        return
    }

    if response.StatusCode != 200 {
        sendError(fmt.Errorf("unexpected response: %v", response.StatusCode))
        return
    }

    // now prepare response data
    // for returning it as a data URI
    base64ResponseBodyData := base64.StdEncoding.EncodeToString(responseBodyData)
    responseBodyDataURI := fmt.Sprintf("data:audio/ogg;base64,%s", base64ResponseBodyData)
    responseBodyDataURIData := []byte(responseBodyDataURI)

    ctx.SetStatusCode(200)
    ctx.Response.Header.Add("Content-Length", fmt.Sprint(len(responseBodyDataURIData)))
    ctx.Response.Header.Add("Content-Type", "text/plain; charset=UTF-8")

    ctx.Write(responseBodyDataURIData)
})

// ...
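The TextToSpeechRequestBody type is not part of the snippet above. A minimal sketch, assuming the JSON field names text and voice that the frontend sends, could look like this:

// TextToSpeechRequestBody describes the JSON payload sent by the frontend
// (field names assumed from the POST body shown in the frontend section)
type TextToSpeechRequestBody struct {
    Text  string `json:"text"`
    Voice string `json:"voice"`
}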
Frontend
The frontend is set up with Vuetify, Vite and TypeScript.
If you want to create a new Vuetify project, simply run npm create vuetify in your terminal and follow the instructions.
When you start the project with npm run dev, the frontend will be reloaded on each code change.
The web app uses the previously mentioned backend endpoint and plays the audio data as a data URI using the browser’s Audio API:
<script lang="ts">
import axios from 'axios';

export default {
  data: () => {
    return {
      isSendingText: false,
      lastStatus: '',
      textValue: '',
      voice: 'Alloy'
    };
  },

  methods: {
    async SendText() {
      const textToSend = String(this.textValue).substring(0, 4000).trim();
      if (textToSend === "") {
        return; // not enough data
      }

      const voice = String(this.voice).toLowerCase().trim() || "alloy";

      this.isSendingText = true;
      try {
        const {
          data,
          status
        } = await axios.post('http://localhost:4000/', {
          text: textToSend,
          voice
        }, {
          responseType: 'text'
        });

        if (status !== 200) {
          throw new Error(`Unexpected response: ${data}`);
        }

        const audio = new Audio(data);
        audio.volume = 0.5;

        audio.play();

        console.log('Playing audio ...');
      } catch (error) {
        console.error('[ERROR]', 'SendText()', error);
      } finally {
        this.isSendingText = false;
      }
    }
  }
}
</script>
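The template part of the component is not shown above. A minimal sketch with Vuetify components could look like the following; the component names and props are standard Vuetify, but the actual markup in the example project may differ:

<template>
  <v-container>
    <!-- text input, limited to the 4000 characters used in SendText() -->
    <v-textarea
      v-model="textValue"
      label="Text to read aloud"
      counter="4000"
    />

    <!-- voice selection, sent lowercased to the backend -->
    <v-select
      v-model="voice"
      :items="['Alloy', 'Echo', 'Fable', 'Onyx', 'Nova', 'Shimmer']"
      label="Voice"
    />

    <!-- triggers the request and shows a loading state while sending -->
    <v-btn
      :loading="isSendingText"
      color="primary"
      @click="SendText"
    >
      Read aloud
    </v-btn>
  </v-container>
</template>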
Start the demo
Before you can run docker compose up, you have to ensure that the following folders contain a .env.local file:
After everything has started, you should be able to open the frontend at http://localhost:3000
Conclusion
You see again: No big deal 😎
Vuetify is a very nice and powerful open source UI framework for SPAs.
Now, with the information from my previous posts, you should be able to create your own kind of Alexa 😉
Have fun while trying it out! 🎉