Creating an AI Companion with GPT-4o

gpt-4o openai May 21, 2024

Have you noticed how AI companions are becoming a big deal lately? The market for these virtual friends hit an impressive $2.8 billion in 2023.

In this article, I'm going to walk you through how to create your very own AI girlfriend using some of the latest tech from OpenAI. We'll be diving into GPT-4o, OpenAI Whisper, and their advanced text-to-speech (TTS) technologies.

By the time we're done, you'll have a cool AI companion that can understand and chat with you in real-time. Let's get started!

First, we'll use Whisper to turn your spoken words into text. Then, GPT-4o will take that text and come up with a response.

Finally, we'll use the Audio API to convert the response back into spoken audio. This way, you'll have a smooth, seamless conversation with your AI girlfriend.

Why GPT-4o?

The new GPT-4o is a game-changer in the world of AI language models. It's super fast, cost-efficient, and has advanced multimodal capabilities.

In the past, we needed fast inference providers like Groq to get near real-time responses. GPT-4o is much faster than GPT-4, which makes real-time conversation practical with OpenAI's own models.

Plus, compared to GPT-4 Turbo, GPT-4o is not only quicker but also 50% cheaper. That makes it perfect for creating interactive applications like an AI girlfriend.
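
To see what the 50% figure means for a chatty voice app, here is a small cost calculation. The per-token prices and the monthly token counts are illustrative assumptions, not numbers from this article, so check OpenAI's pricing page for current values. Only the 2:1 price ratio reflects the claim above.

```python
def monthly_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost in dollars for a given token volume.

    Prices are expressed in dollars per one million tokens.
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical usage: 2M input tokens and 1M output tokens per month
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 1_000_000

# Assumed prices in dollars per 1M tokens, chosen only to show a 2:1 ratio
turbo_cost = monthly_cost(INPUT_TOKENS, OUTPUT_TOKENS, 10.0, 30.0)
gpt4o_cost = monthly_cost(INPUT_TOKENS, OUTPUT_TOKENS, 5.0, 15.0)

savings = 1 - gpt4o_cost / turbo_cost
print(f"GPT-4 Turbo: ${turbo_cost:.2f}, GPT-4o: ${gpt4o_cost:.2f}, savings: {savings:.0%}")
# prints: GPT-4 Turbo: $50.00, GPT-4o: $25.00, savings: 50%
```

Because both input and output prices are halved in this example, the savings stay at 50% no matter how the traffic splits between input and output tokens.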

Let’s Build Your AI Girlfriend

Alright, let's get started! First things first, we need to install the necessary libraries. PyAudio is included because SpeechRecognition needs it for microphone access:

pip install openai SpeechRecognition pygame setuptools pyaudio

Next, we'll handle audio recording and playback. Create a file named utils.py and add the necessary methods.

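import time

import pygame
import speech_recognition as sr


def record_audio(file_path):
    """Record one utterance from the default microphone and save it as WAV."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Please say something...")
        # listen() blocks until it hears speech followed by a pause
        audio_data = recognizer.listen(source)
        print("Recording complete.")
    with open(file_path, "wb") as audio_file:
        audio_file.write(audio_data.get_wav_data())


def play_audio(file_path):
    """Play an audio file and block until playback finishes."""
    pygame.mixer.init()
    pygame.mixer.music.load(file_path)
    pygame.mixer.music.play()
    # Poll until the audio has finished playing
    while pygame.mixer.music.get_busy():
        time.sleep(0.1)
    # Release the file so it can be overwritten on the next turn (pygame 2.0+)
    pygame.mixer.music.unload()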

The record_audio function captures audio from your microphone, saves it to a specified file, and lets you know when the recording starts and stops.

The play_audio function sets up the Pygame mixer, loads your audio file, and plays it back, waiting until it's done.
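
In a noisy room, the recorder may trigger on background sound or keep listening for a long time. The sketch below is one possible variant of record_audio that calibrates for ambient noise first and caps the length of a phrase. The function name record_audio_calibrated and its default values are hypothetical; adjust_for_ambient_noise and the phrase_time_limit argument come from the SpeechRecognition library.

```python
def record_audio_calibrated(file_path, calibration_seconds=1.0, max_phrase_seconds=15):
    """Variant of record_audio with noise calibration and a phrase length cap."""
    # Imported here so this module still loads if the library is not installed
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise briefly to set the energy threshold
        recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
        print("Please say something...")
        # phrase_time_limit cuts the recording off after this many seconds
        audio_data = recognizer.listen(source, phrase_time_limit=max_phrase_seconds)
        print("Recording complete.")
    with open(file_path, "wb") as audio_file:
        audio_file.write(audio_data.get_wav_data())
```

Calling record_audio_calibrated("test.wav") works as a drop-in replacement for record_audio("test.wav"), at the cost of about one extra second per turn for calibration.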

Now, let's implement the main logic in a file named app.py. This will include initializing the OpenAI client, transcribing audio, and generating responses.

from openai import OpenAI

from utils import record_audio, play_audio

# Reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

while True:
    # 1. Record one utterance from the microphone
    record_audio("test.wav")

    # 2. Transcribe the recording with Whisper
    with open("test.wav", "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    print(transcription.text)

    # 3. Generate a reply with GPT-4o
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are my girlfriend. Please answer in short sentences and be kind."},
            {"role": "user", "content": f"Please answer: {transcription.text}"},
        ],
    )
    reply = response.choices[0].message.content
    print(reply)

    # 4. Convert the reply to speech and play it back
    speech = client.audio.speech.create(
        model="tts-1",
        voice="nova",
        input=reply,
    )
    speech.stream_to_file("output.mp3")
    play_audio("output.mp3")

This code sets up a continuous loop where it:

Records your audio input.
Transcribes it into text using OpenAI’s Whisper model.
Prints out the transcription.

Next, it uses GPT-4o to generate a response based on what you said, converts that response into speech with a text-to-speech model, and plays the audio back to you.
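
Each pass through the loop sends only the latest utterance, so the companion has no memory of earlier turns. One way to add short-term memory is to keep past exchanges and send the most recent ones with every request. The helper below is a minimal sketch of that idea; build_messages, max_turns, and the sample history are hypothetical names and data, not part of the OpenAI library.

```python
SYSTEM_PROMPT = "You are my girlfriend. Please answer in short sentences and be kind."


def build_messages(history, user_text, max_turns=10):
    """Return the message list for the next chat request.

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    Only the most recent max_turns pairs are included.
    """
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    # Guard against max_turns == 0, where history[-0:] would return everything
    recent = history[-max_turns:] if max_turns > 0 else []
    for past_user, past_assistant in recent:
        messages.append({"role": "user", "content": past_user})
        messages.append({"role": "assistant", "content": past_assistant})
    messages.append({"role": "user", "content": user_text})
    return messages


# Example: two earlier turns, but only one of them is remembered
history = [("Hi!", "Hello, love."), ("How was your day?", "Lovely, thanks.")]
messages = build_messages(history, "What did I just ask you?", max_turns=1)
for message in messages:
    print(message["role"], ":", message["content"])
```

In app.py, the result would be passed as the messages argument of client.chat.completions.create, and after each turn the pair of transcribed text and generated reply would be appended to history. Capping the number of turns keeps the request small, which matters for both latency and cost.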

The record_audio and play_audio functions take care of recording and playing back the audio, respectively.
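
Two practical gaps remain in the loop: it runs until you kill the process, and it fails on the first request if no API key is configured. The sketch below shows one way to handle both. The helper names and the list of exit phrases are hypothetical; the only library fact relied on is that the OpenAI client reads its key from the OPENAI_API_KEY environment variable by default.

```python
import os

EXIT_PHRASES = {"goodbye", "bye", "stop"}  # hypothetical; choose your own


def has_api_key(env=os.environ):
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


def wants_to_exit(transcript):
    """Return True if the whole transcript is an exit phrase.

    Case and surrounding punctuation are ignored, since transcripts
    often come back as "Goodbye." or "Stop!".
    """
    cleaned = transcript.lower().strip(" .!?,")
    return cleaned in EXIT_PHRASES


print(wants_to_exit("Goodbye!"))                   # True
print(wants_to_exit("Say goodbye to my worries"))  # False
```

In app.py, has_api_key() could be checked once before creating the client, and wants_to_exit(transcription.text) could trigger a break right after the transcription step. Matching the whole phrase rather than a substring avoids ending the conversation when an exit word merely appears inside a longer sentence.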

Conclusion

Creating an AI girlfriend using GPT-4o and OpenAI Whisper is a fascinating and educational project. By following this guide, you've built a functional AI companion capable of real-time interactions.

As you get more comfortable with these technologies, don't hesitate to explore more advanced features and enhancements. Happy coding!
