RealChar

RealChar lets you create AI characters and interact with them in real time. Customize personality and voice, and chat on web, mobile, and terminal platforms without writing code. Create your ideal AI companion!

RealChar is a versatile open-source platform for creating, customizing, and chatting interactively with AI characters in real time. It delivers natural, fluid dialogue using OpenAI GPT-3.5/4, Anthropic Claude 2, and Anyscale Llama 2 as language models, with Whisper for speech recognition and ElevenLabs for speech synthesis. It supports web, mobile, and terminal clients, and its highly modular architecture makes it a good starting point for AI engineering projects.

Key capabilities

01 6,139 stars on GitHub

02 Customizable: change the AI character's personality, backstory, and voice.

03 Easy to use: no programming skills are needed to create your own AI character.

04 Modular: easily swap out the AI modules (LLM, speech recognition, speech synthesis).

05 Realtime: talk to or message an AI character in real time.

06 Multi-platform: interact with AI characters on web, terminal, and mobile.

Use cases

01 Building personalized AI companions for interactive conversation.

02 Developing realtime voice AI applications for different client platforms.

03 Experimenting with and integrating different AI models (LLM, STT, TTS).

RealChar. - Your Realtime AI Character

🎙️🤖Create, customize and talk to your AI Character/Companion in realtime🎙️🤖

✨ Demo

Try our site at RealChar.ai

Not sure how to pronounce RealChar? Listen to this 👉 audio

Demo 1 - with Santa Claus!

Santa_tw.mp4

Demo 2 - with AI Elon about cage fight!

elon-edit-camera.mp4

Demo 3 - with AI Raiden about AI and "real" memory

raiden.mp4

Demo settings: Web, GPT4, ElevenLabs with voice clone, Chroma, Google Speech to Text

🎯 Key Features

Easy to use: No coding required to create your own AI character.
Customizable: You can customize your AI character's personality, background, and even voice.
Realtime: Talk to or message your AI character in realtime.
Multi-Platform: You can talk to your AI character on web, terminal, and mobile (yes, we open-sourced our mobile app).
Most up-to-date AI: We use up-to-date AI technology to power your AI character, including OpenAI, Anthropic Claude 2, Chroma, Whisper, ElevenLabs, etc.
Modular: You can easily swap out different modules to customize your flow. Less opinionated, more flexible. A great project to start your AI Engineering journey.
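
To make the "Modular" point concrete, here is a minimal Python sketch of how swappable speech modules could be wired together. It is not RealChar's actual code: the `TextToSpeech` protocol, the `EchoTTS` and `UpperTTS` stand-in engines, and the `get_tts` registry are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Protocol


class TextToSpeech(Protocol):
    """Hypothetical interface that every TTS module would implement."""

    def synthesize(self, text: str) -> bytes: ...


class EchoTTS:
    """Stand-in engine: returns the UTF-8 bytes of the text instead of audio."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


class UpperTTS:
    """Second stand-in engine, only here to show engines are interchangeable."""

    def synthesize(self, text: str) -> bytes:
        return text.upper().encode("utf-8")


# Registry mapping a config value to an engine factory (illustrative names).
_TTS_REGISTRY: Dict[str, Callable[[], TextToSpeech]] = {
    "echo": EchoTTS,
    "upper": UpperTTS,
}


def get_tts(name: str) -> TextToSpeech:
    """Look up an engine by name, failing loudly on unknown names."""
    try:
        return _TTS_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS engine: {name!r}") from None
```

Because callers depend only on `synthesize`, swapping one engine for another is a one-word config change, for example `get_tts("upper")` instead of `get_tts("echo")`.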

🔬 Tech stack

✅Web: React JS, Vanilla JS, WebSockets
✅Mobile: Swift, WebSockets
✅Backend: FastAPI, SQLite, Docker
✅Data Ingestion: LlamaIndex, Chroma
✅LLM Orchestration: LangChain, Chroma
✅LLM: ReByte, OpenAI GPT3.5/4, Anthropic Claude 2, Anyscale Llama2
✅Speech to Text: Local WhisperX, Local Whisper, OpenAI Whisper API, Google Speech to Text
✅Text to Speech: ElevenLabs, Edge TTS, Google Text to Speech
✅Voice Clone: ElevenLabs

📚 Comparison with existing products

📀 Quick Start - Installation via Docker

Create a new .env file
```
cp .env.example .env
```
Paste your API keys in .env file. A single ReByte or OpenAI API key is enough to get started.

You can also configure other API keys if you have them.
Start the app with docker-compose.yaml
```
docker compose up
```
If you have issues with docker (especially on a non-Linux machine), please refer to https://docs.docker.com/get-docker/ (installation) and https://docs.docker.com/desktop/troubleshoot/overview/ (troubleshooting).
Open http://localhost:3000 and enjoy the app!
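
A quick way to confirm that .env contains at least one of the two keys mentioned above is sketched below. The variable names `REBYTE_API_KEY` and `OPENAI_API_KEY` are assumptions, not taken from this README; check .env.example for the real names. The parser handles only plain `KEY=VALUE` lines and full-line `#` comments.

```python
from typing import Dict, Iterable, Mapping


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and full-line # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def has_llm_key(
    env: Mapping[str, str],
    # Assumed variable names; verify them against .env.example.
    names: Iterable[str] = ("REBYTE_API_KEY", "OPENAI_API_KEY"),
) -> bool:
    """Return True if at least one of the named keys has a non-empty value."""
    return any(env.get(name) for name in names)
```

Usage: `has_llm_key(parse_env(open(".env").read()))`.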

💿 Developers - Installation via Python

Step 1. Clone the repo

git clone https://github.com/Shaunwei/RealChar.git && cd RealChar

Step 2. Install requirements

Install portaudio and ffmpeg for audio
```
# for mac
brew install portaudio
brew install ffmpeg
```
```
# for ubuntu
sudo apt update
sudo apt install portaudio19-dev
sudo apt install ffmpeg
```
Note:
- ffmpeg>=4.4 is needed to work with torchaudio>=2.1.0
- Mac users may need to add ffmpeg library path to DYLD_LIBRARY_PATH for torchaudio to work:
```
export DYLD_LIBRARY_PATH=/opt/homebrew/lib:$DYLD_LIBRARY_PATH
```
Then install all python requirements
```
pip install -r requirements.txt
```
If you need a faster local speech to text, install whisperX
```
pip install git+https://github.com/m-bain/whisperx.git
```
Step 3. Create an empty sqlite database if you have not done so before
```
sqlite3 test.db "VACUUM;"
```
Step 4. Run db upgrade
```
alembic upgrade head
```
This ensures your database schema is up to date. Run it every time you pull the main branch.
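
Steps 3 and 4 can also be checked from Python. The sketch below uses only the standard-library `sqlite3` module; `create_empty_db` mirrors the `sqlite3 test.db "VACUUM;"` command, and `list_tables` is a hypothetical helper for confirming that the migrations created tables. Which tables to expect depends on the project's migrations and is not assumed here.

```python
import sqlite3
from typing import List


def create_empty_db(path: str) -> None:
    """Create an empty SQLite database file, like `sqlite3 <path> "VACUUM;"`."""
    # isolation_level=None keeps the connection in autocommit mode,
    # because VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()


def list_tables(path: str) -> List[str]:
    """Return the table names in the database, sorted alphabetically."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

On a freshly created database `list_tables("test.db")` returns an empty list; after `alembic upgrade head` it should list the tables the migrations define.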
Step 5. Set up .env:
```
cp .env.example .env
```
Update API keys and configs following the instructions in the .env file.

Note that some features require a working login system. You can get your own OAuth2 login for free with Firebase if needed. To enable, set USE_AUTH to true and fill in the FIREBASE_CONFIG_PATH field. Also fill in Firebase configs in client/next-web/.env.

Step 6. Run backend server with cli.py or use uvicorn directly

python cli.py run-uvicorn
# or
uvicorn realtime_ai_character.main:app

Step 7. Run frontend client:
- web client:
  
  Create an .env file under client/next-web/
```
cp client/next-web/.env.example client/next-web/.env
```
  Adjust .env according to the instructions in client/next-web/README.md.
  
  Start the frontend server:
```
python cli.py next-web-dev
# or
cd client/next-web && npm run dev
# or
cd client/next-web && npm run build && npm run start
```
  After running these commands, a local development server will start, and your default web browser will open a new tab/window pointing to this server (usually http://localhost:3000).
- (Optional) Terminal client:
  
  Run the following command in your terminal
```
python client/cli.py
```
- (Optional) mobile client:
  
  Open client/mobile/ios/rac/rac.xcodeproj/project.pbxproj in Xcode and run the app.
Step 8. Select a character to talk to, then start talking. Use GPT-4 for better conversation, and wear headphones for the best audio (this avoids echo).

Note: if you want to connect to a RealChar server remotely, SSL must be set up to establish the audio connection.

👨‍🚀 API Keys and Configurations

1. LLMs

1.1 ReByte API Key

To get your ReByte API key, follow these steps:

Go to the ReByte website and sign up for an account if you haven't already.
Once you're logged in, go to Settings > API Keys.
Generate a new API key by clicking on the "Generate" button.

1.2 (Optional) OpenAI API Token

RealChar can use the OpenAI API for its language models. To use it, you need an API token.

To get your OpenAI API token, follow these steps:

Go to the OpenAI website and sign up for an account if you haven't already.
Once you're logged in, navigate to the API keys page.
Generate a new API key by clicking on the "Create API Key" button.

(Optional) To use the Azure OpenAI API instead, configure the following in your .env file.

Set the API type:

OPENAI_API_TYPE=azure

If you want to use the earlier version 2023-03-15-preview:

OPENAI_API_VERSION=2023-03-15-preview

Set the base URL for your Azure OpenAI resource. You can find it in the Azure portal under your Azure OpenAI resource.

OPENAI_API_BASE=https://your-base-url.openai.azure.com

Set the OpenAI model deployment name for your Azure OpenAI resource:

OPENAI_API_MODEL_DEPLOYMENT_NAME=gpt-35-turbo-16k

Set the OpenAIEmbeddings model deployment name for your Azure OpenAI resource:

OPENAI_API_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002
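
A small check like the following can catch an incomplete Azure configuration before the server starts. It is a sketch, not part of RealChar: the variable names come from the section above, but treating all three as required whenever `OPENAI_API_TYPE=azure` is an assumption, and `missing_azure_settings` is a hypothetical helper.

```python
from typing import List, Mapping

# Variable names taken from the Azure OpenAI instructions above.
AZURE_SETTINGS = (
    "OPENAI_API_BASE",
    "OPENAI_API_MODEL_DEPLOYMENT_NAME",
    "OPENAI_API_EMBEDDING_DEPLOYMENT_NAME",
)


def missing_azure_settings(env: Mapping[str, str]) -> List[str]:
    """Return Azure settings that are unset or empty.

    Returns an empty list when Azure is not selected at all.
    """
    if env.get("OPENAI_API_TYPE", "").lower() != "azure":
        return []
    return [name for name in AZURE_SETTINGS if not env.get(name)]
```

Usage: `missing_azure_settings(os.environ)` after loading .env; an empty list means nothing is missing.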

1.3 (Optional) Anthropic(Claude 2) API Token

To get your Anthropic API token, follow these steps:

Go to the Anthropic website and sign up for an account if you haven't already.
Once you're logged in, navigate to the API keys page.
Generate a new API key by clicking on the "Create Key" button.

1.4 (Optional) Anyscale API Token

To get your Anyscale API token, follow these steps:

Go to the Anyscale website and sign up for an account if you haven't already.
Once you're logged in, navigate to the Credentials page.
Generate a new API key by clicking on the "Generate credential" button.

2. Speech to Text

We support faster-whisper and whisperX as the local speech to text engines. Both work with CPU and NVIDIA GPU.

2.1 (Optional) Google Speech-to-Text API

To get your Google Cloud API credentials file (google_credentials.json), follow these steps:

Go to the GCP website and sign up for an account if you haven't already.
Follow the guide to create a project and enable the Speech to Text API.
Put google_credentials.json in the root folder of this project. See "Create and delete service account keys" in the Google Cloud documentation.
Set SPEECH_TO_TEXT_USE to GOOGLE in your .env file.

2.2 (Optional) OpenAI Whisper API

Same as OpenAI API Token

3. Text to Speech

Edge TTS is the default and is free to use.

3.1 (Optional) ElevenLabs API Key

Creating an ElevenLabs Account

Visit ElevenLabs to create an account. You'll need this to access the text to speech and voice cloning features.
You can get an API key in your Profile Settings.

3.2 (Optional) Google Text-to-Speech API

To get your Google Cloud API credentials file (google_credentials.json), follow these steps:

Go to the GCP website and sign up for an account if you haven't already.
Follow the guide to create a project and enable the Text to Speech API.
Put google_credentials.json in the root folder of this project. See "Create and delete service account keys" in the Google Cloud documentation.

(Optional) 🔥 Create Your Own Characters

Create Characters Locally

See realtime_ai_character/character_catalog/README.md.

Create Characters on ReByte.ai

See docs/rebyte_agent_clone_instructions.md.

(Optional) ☎️ Twilio Integration

To use Twilio with RealChar, you need to set up a Twilio account. Then, fill in the following environment variables in your .env file:

TWILIO_ACCOUNT_SID=YOUR_TWILIO_ACCOUNT_SID
TWILIO_ACCESS_TOKEN=YOUR_TWILIO_ACCESS_TOKEN
DEFAULT_CALLOUT_NUMBER=YOUR_PHONE_NUMBER

You'll also need to install torch and torchaudio to use Twilio.

Now, you can receive phone calls from your characters by typing /call YOURNUMBER in the text box when chatting with your character.

Note: only US phone numbers and ElevenLabs-voiced characters are supported at the moment.
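
The Twilio settings and the US-only restriction can be sanity-checked with a sketch like this one. The environment variable names and the `YOUR_...` placeholders come from the section above; everything else is an assumption, including the helper names and the E.164 format (`+1` followed by ten digits) used to recognize a US number. The format that `/call` actually accepts is not stated in this README.

```python
import re
from typing import List, Mapping

# Variable names taken from the Twilio instructions above.
TWILIO_SETTINGS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_ACCESS_TOKEN",
    "DEFAULT_CALLOUT_NUMBER",
)

# Assumed format: E.164 US number, "+1" plus 10 digits, area code starting 2-9.
_US_NUMBER = re.compile(r"\+1[2-9]\d{9}")


def missing_twilio_settings(env: Mapping[str, str]) -> List[str]:
    """Return settings that are unset, empty, or still a YOUR_... placeholder."""
    return [
        name
        for name in TWILIO_SETTINGS
        if not env.get(name) or env[name].startswith("YOUR_")
    ]


def is_us_number(number: str) -> bool:
    """Return True if the whole string is an E.164-formatted US number."""
    return _US_NUMBER.fullmatch(number) is not None
```

The placeholder check exists because a copied .env that still contains `YOUR_TWILIO_ACCOUNT_SID` is non-empty but not usable.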

🆕! Anyscale and LangSmith integration

Anyscale

You can now use Anyscale Endpoint to serve Llama-2 models in your RealChar easily! Simply register an account with Anyscale Endpoint. Once you get the API key, set this environment variable in your .env file:

ANYSCALE_ENDPOINT_API_KEY=<your API Key>

By default, we show the largest servable Llama-2 model (70B) in the Web UI. You can change the model name (meta-llama/Llama-2-70b-chat-hf) to other models, e.g. 13b or 7b versions.
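
The model names for the other sizes follow the same pattern as the 70B default. A minimal sketch, assuming the names differ only in the size segment; `llama2_chat_model` is a hypothetical helper, and where RealChar reads the model name from is not covered here.

```python
# Sizes mentioned in the text above.
LLAMA2_SIZES = ("7b", "13b", "70b")


def llama2_chat_model(size: str = "70b") -> str:
    """Build a Llama-2 chat model name such as meta-llama/Llama-2-70b-chat-hf."""
    size = size.lower()
    if size not in LLAMA2_SIZES:
        raise ValueError(f"unsupported Llama-2 size: {size!r}")
    return f"meta-llama/Llama-2-{size}-chat-hf"
```

For example, `llama2_chat_model("13b")` gives the 13B variant's name.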

LangSmith

If you have access to LangSmith, you can edit these environment variables to enable:

LANGCHAIN_TRACING_V2=false # default off
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=YOUR_LANGCHAIN_API_KEY
LANGCHAIN_PROJECT=YOUR_LANGCHAIN_PROJECT

And it should work out of the box.

📍 Roadmap

Launch v0.0.4
Create a new character via web UI
Lower conversation latency
Support Twilio
Support ReByte
Persistent conversation*
Session management*
Support RAG*
Support Agents/GPTs*
Add additional TTS service*

\* These features are powered by the ReByte platform.

🫶 Contribute to RealChar

Please check out our Contribution Guide!

💪 Contributors

🎲 Community

Join us on Discord

RealChar: talk to AI characters in real time

RealChar is an open-source project for creating and customizing AI characters and talking to them by voice or text in real time. It works on the web, in the terminal, and in a mobile app (iOS).

Demo: try it at RealChar.ai.

Key features

Easy to use: no coding required to create your own character.
Customizable: set the character's personality, backstory, and even voice.
Realtime: talk by voice or exchange messages.
Multi-platform: web, terminal, mobile app.
Up-to-date AI models: OpenAI GPT-3.5/4, Anthropic Claude 2, Anyscale Llama 2, and others.
Modular: components can be swapped out, which makes it a good base for learning AI engineering.

Tech stack

Web: React, Vanilla JS, WebSockets.
Mobile app: Swift, WebSockets.
Backend: FastAPI, SQLite, Docker.
Data ingestion: LlamaIndex, Chroma.
LLM orchestration: LangChain, Chroma.
Speech to text: local WhisperX/Whisper, OpenAI Whisper API, Google Speech-to-Text.
Text to speech: ElevenLabs, Edge TTS, Google Text-to-Speech.
Voice clone: ElevenLabs.

Quick start with Docker

Copy the environment file:
```
cp .env.example .env
```
Paste at least one API key into .env, from either ReByte or OpenAI. The others (ElevenLabs, Google, Anthropic, and so on) are optional.
Start the app:
```
docker compose up
```
Open http://localhost:3000 in your browser.

If you have problems with Docker (especially on a non-Linux machine), see the official Docker installation and troubleshooting documentation.

Developer installation (without Docker)

Clone the repository:

git clone https://github.com/Shaunwei/RealChar.git && cd RealChar

Install the system audio dependencies: portaudio and ffmpeg (version >= 4.4).
- macOS: brew install portaudio ffmpeg
- Ubuntu: sudo apt install portaudio19-dev ffmpeg
Install the Python dependencies:
```
pip install -r requirements.txt
```
For faster local speech recognition, also install WhisperX:
```
pip install git+https://github.com/m-bain/whisperx.git
```
Create an empty SQLite database and apply the migrations:
```
sqlite3 test.db "VACUUM;"
alembic upgrade head
```
Run alembic upgrade head every time you update the main branch.
Set up .env (copy .env.example and fill in the keys).

Some features require a working login system. If needed, you can use OAuth2 login from Firebase: set USE_AUTH=true and point FIREBASE_CONFIG_PATH at your Firebase config file.
Start the backend:
```
python cli.py run-uvicorn
```
or uvicorn realtime_ai_character.main:app.
Start the web client:
- Go to client/next-web/, copy .env.example to .env, and edit it.
- Run:
```
python cli.py next-web-dev
```
  or cd client/next-web && npm run dev.
Open http://localhost:3000, pick a character, and start talking.

Tip: use GPT-4 for better conversation, and wear headphones for the best audio (this avoids echo). To connect to a RealChar server remotely, you need to set up SSL.

API keys and configuration

LLM

ReByte (primary): sign up at ReByte and create an API key in the settings.

OpenAI (optional): get a key from OpenAI. For Azure OpenAI, set the following in .env:

OPENAI_API_TYPE=azure
OPENAI_API_BASE=https://your-base-url.openai.azure.com
OPENAI_API_MODEL_DEPLOYMENT_NAME=gpt-35-turbo-16k
OPENAI_API_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002

Anthropic Claude 2 (optional): get a key from the Anthropic console.
Anyscale Llama 2 (optional): sign up for Anyscale Endpoints and create a key.

Speech to text

Local speech to text is supported through faster-whisper and WhisperX (both work with CPU and NVIDIA GPU).

Google Speech-to-Text: create a GCP project, enable the API, download google_credentials.json, and put it in the project root. In .env, set SPEECH_TO_TEXT_USE=GOOGLE.
OpenAI Whisper API: use the same OpenAI key.

Text to speech

The default is Edge TTS, which is free to use.

ElevenLabs: for text to speech and voice cloning, sign up at ElevenLabs and get an API key in your profile settings.
Google Text-to-Speech: same steps as Google Speech-to-Text, with the Text to Speech API enabled and google_credentials.json in the project root.

Creating your own characters

Locally: see realtime_ai_character/character_catalog/README.md.
On ReByte.ai: see docs/rebyte_agent_clone_instructions.md.

Twilio integration (phone calls)

Only US phone numbers and ElevenLabs voices are supported. Fill in the following in .env:

TWILIO_ACCOUNT_SID=YOUR_TWILIO_ACCOUNT_SID
TWILIO_ACCESS_TOKEN=YOUR_TWILIO_ACCESS_TOKEN
DEFAULT_CALLOUT_NUMBER=YOUR_PHONE_NUMBER

You also need to install torch and torchaudio. While chatting with a character, send the command /call YOURNUMBER and the character will call you.

Links

Source: https://mcpmarket.com/server/realchar
