Chobits

Warning

This project is under active development; nothing is stable yet.

Purpose

  • To learn the Rust programming language, voice interaction, and large language models.

  • To build a self-contained chatbot (every component is self-hosted, e.g. LLM, TTS), like xiaozhi-esp32 but with a self-hosted server.

Further information

Looking for an overview of the interface? Check it out!

Login/Register Page

TODO

User Dashboard

TODO

Features

  • Connection: WebSocket
  • Voice interaction: VAD, ASR, TTS
  • Chat: LLM
  • MCP: self-hosted or remote server MCP, device MCP
  • Backend
    1. Home page (WIP)
    2. Admin console (WIP)
    3. Simulated device in the web UI (WIP)
  • Deploy: binary (WIP), Docker (WIP)
  • Compatible devices
    1. xiaozhi-esp32 (WIP)
    2. chobits (cross-platform app built with Flutter) (WIP)

System Requirements

TODO

Guide

TODO

QuickStart

Development

apps/server

pnpm i
pnpm exec nx run @chobits/server-ui:build
./apps/server/script/download_model.sh
# using cuda: pnpm nx run chobits-server:dev_cuda
pnpm nx run chobits-server:dev

apps/server-ui

pnpm i
pnpm exec nx run @chobits/server-ui:dev

apps/app

TODO

Building

TODO

Using

TODO

Development

File structure

TODO

Chat Flow

flowchart TB
  subgraph Device
    direction TB
    DeviceSession[Device Session] --> DeviceMCPServer[Device MCP Server]
    DeviceMCPServer .-> DeviceSession
  end
  WebSocket
    subgraph Server
    direction LR
    ServerSession[Server Session]
    ServerMCPHost[Server MCP Host]
    ServerMCPClient[Server MCP Client]
    ServerMCPServer[Server MCP Server]
    RemoteServerMCPServer[Remote Server MCP Server]
    VAD
    ASR
    LLM
    TTS

    ServerSession --> ServerMCPHost
    ServerMCPHost --> ServerMCPClient
    ServerMCPClient --> ServerMCPServer
    ServerMCPServer .-> ServerMCPClient
    ServerMCPClient --> RemoteServerMCPServer
    RemoteServerMCPServer .-> ServerMCPClient
    ServerMCPClient .-> ServerMCPHost
    ServerMCPHost .-> ServerSession

    ServerSession --> VAD
    VAD --> ASR
    ASR --> LLM
    LLM --> ServerMCPHost
    ServerMCPHost .-> LLM
    LLM --> TTS
    TTS .-> ServerSession
  end
  subgraph Transport
    WebSocket
  end

  DeviceSession <--> WebSocket
  WebSocket <--> ServerSession

Handshake phase

sequenceDiagram
    autonumber
    Device Session ->> Server Session: 1. websocket connect request
    Server Session -->> Device Session: 2. websocket connect response
    Device Session ->> Server Session: 3. hello message request
    Server Session -->> Device Session: 4. hello message response
    alt Hello message response has mcp = true
        Server Session ->> Device Session: 5. mcp initialize message request
        Device Session -->> Server Session: 6. mcp initialize message response
        Server Session ->> Device Session: 7. mcp tools list message request
        Device Session -->> Server Session: 8. mcp tools list message response
        loop Tools list message response has next cursor
            Server Session ->> Device Session: 7. mcp tools list message request
            Device Session -->> Server Session: 8. mcp tools list message response
        end
    end
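
The tools-list loop at the end of the handshake can be sketched as follows. This is a minimal sketch, not the project's implementation: `ToolsPage`, `collect_tools`, and the `fetch_page` callback are hypothetical names, and the callback stands in for sending a `tools/list` request over the WebSocket and parsing the reply.

```rust
/// One page of a `tools/list` response (hypothetical type for illustration).
pub struct ToolsPage {
    /// Tool names found on this page.
    pub tools: Vec<String>,
    /// `Some(cursor)` when the device reports that more pages follow.
    pub next_cursor: Option<String>,
}

/// Requests pages until a response carries no next cursor.
/// `fetch_page` is an assumed stand-in for the real request/response round trip.
pub fn collect_tools<F>(mut fetch_page: F) -> Vec<String>
where
    F: FnMut(&str) -> ToolsPage,
{
    let mut tools = Vec::new();
    // The first request uses an empty cursor.
    let mut cursor = String::new();
    loop {
        let page = fetch_page(&cursor);
        tools.extend(page.tools);
        match page.next_cursor {
            // Treat an empty cursor like a missing one to avoid looping forever.
            Some(next) if !next.is_empty() => cursor = next,
            _ => break,
        }
    }
    tools
}
```

Collecting every page before the session starts keeps the later tool lookup simple, at the cost of a slightly longer handshake for devices that expose many tools.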

Communication phase

sequenceDiagram
    autonumber
    participant DeviceSession as Device Session
    participant ServerSession as Server Session
    DeviceSession ->> ServerSession: audio data
    DeviceSession ->> ServerSession: listen(detect) message
    ServerSession -->> DeviceSession: stt message
    DeviceSession ->> ServerSession: listen(start) message
    loop
      DeviceSession ->> ServerSession: audio data
      break when no voice timeout
        ServerSession ->> DeviceSession: disconnect
      end
      par
        ServerSession ->> ServerSession: vad handle
        opt if voice silence timeout
          ServerSession ->> ServerSession: send main handle stop signal
        end
      and
        opt if voice silence timeout
          note right of ServerSession: exits the following logic on receiving the main handle stop signal
          ServerSession ->> ServerSession: asr handle
          ServerSession ->> ServerSession: llm handle
          loop while the last LLM message is a tool call response
            ServerSession ->> ServerSession: mcp handle
            ServerSession ->> ServerSession: llm handle
          end
          loop
            ServerSession -->> DeviceSession: llm message
            ServerSession -->> DeviceSession: tts(start) message
            ServerSession -->> DeviceSession: tts(sentence start) message
            ServerSession -->> DeviceSession: audio data
            ServerSession -->> DeviceSession: tts(sentence end) message
            ServerSession -->> DeviceSession: tts(stop) message
          end
        end
      end
    end
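
The inner "llm handle" / "mcp handle" loop of the diagram can be sketched like this. It is a sketch under stated assumptions: `LlmOutput`, `run_llm_with_tools`, the string-based history, and the `max_rounds` safety limit are all hypothetical and are not taken from the project's code. The `llm` and `mcp` callbacks stand in for the real model and tool calls.

```rust
/// What the LLM produced for one turn (hypothetical type for illustration).
pub enum LlmOutput {
    /// A request to call the tool with the given name.
    ToolCall(String),
    /// Final text that would be handed to TTS.
    Text(String),
}

/// While the LLM asks for a tool, the tool result is appended to the
/// history and the LLM is called again. Returns the final text, or `None`
/// if the LLM is still requesting tools after `max_rounds` tool rounds.
pub fn run_llm_with_tools<L, M>(
    history: &mut Vec<String>,
    mut llm: L,
    mut mcp: M,
    max_rounds: usize,
) -> Option<String>
where
    L: FnMut(&[String]) -> LlmOutput,
    M: FnMut(&str) -> String,
{
    // Up to `max_rounds` tool rounds plus one final LLM call.
    for _ in 0..=max_rounds {
        match llm(history.as_slice()) {
            LlmOutput::Text(text) => return Some(text),
            LlmOutput::ToolCall(name) => {
                let result = mcp(&name);
                history.push(format!("tool:{name}={result}"));
            }
        }
    }
    None
}
```

Bounding the number of rounds is a design choice for the sketch: it keeps a misbehaving model from holding the session in the tool loop indefinitely.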

MCP handle

sequenceDiagram
    autonumber
    participant DeviceSession as Device Session
    participant ServerSession as Server Session
    participant ServerMCPServer as Server MCP Server
    alt if call device tool
      ServerSession ->> DeviceSession: mcp tools call message request
      DeviceSession -->> ServerSession: mcp tools call message response
    else if call server tool
      ServerSession ->> ServerMCPServer: mcp tools call http request
      ServerMCPServer -->> ServerSession: mcp tools call http response
    end
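
The branch in the diagram above needs a rule for deciding where a tool lives. One possible rule, shown here as a sketch, is to treat every tool the device advertised in its `tools/list` response as a device tool and everything else as a server tool. `ToolTarget` and `route_tool` are hypothetical names, and the lookup rule itself is an assumption rather than the project's documented behavior.

```rust
use std::collections::HashSet;

/// Where a tool call should be sent (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum ToolTarget {
    /// Sent to the device as an MCP message over the WebSocket.
    Device,
    /// Sent to a server-side MCP server over HTTP.
    Server,
}

/// Picks the target by checking whether the device advertised the tool.
pub fn route_tool(name: &str, device_tools: &HashSet<String>) -> ToolTarget {
    if device_tools.contains(name) {
        ToolTarget::Device
    } else {
        ToolTarget::Server
    }
}
```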

API request and response

  • websocket connect request

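    When the WebSocket connection is established, the example code sets the following request headers:

    • Authorization: carries the access token, in the form "Bearer <token>"
    • Protocol-Version: protocol version number, consistent with the version field in the hello message body
    • Device-Id: MAC address of the device's physical network interface
    • Client-Id: software-generated UUID (reset when NVS is erased or the full firmware is reflashed)

    These headers are sent to the server as part of the WebSocket handshake; the server can use them for validation, authentication, and so on.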

  • websocket connect response

  • hello message request

    {
      "type": "hello",
      "version": 1,
      "features": {
        "mcp": true
      },
      "transport": "websocket",
      "audio_params": {
        "format": "opus",
        "sample_rate": 16000,
        "channels": 1,
        "frame_duration": 60
      }
    }
    
  • hello message response

    {
      "type": "hello",
      "transport": "websocket",
      "session_id": "xxx",
      "audio_params": {
        "format": "opus",
        "sample_rate": 24000,
        "channels": 1,
        "frame_duration": 60
      }
    }
    
  • mcp initialize message request

    {
      "jsonrpc": "2.0",
      "method": "initialize",
      "params": {
        "capabilities": {
          // client capabilities, optional
        }
      },
      "id": 1 // request ID
    }
    
  • mcp initialize message response

    {
      "jsonrpc": "2.0",
      "id": 1, // matches the request ID
      "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {
          "tools": {} // tools does not appear to list details here; use tools/list
        },
        "serverInfo": {
          "name": "...", // device name (BOARD_NAME)
          "version": "..." // device firmware version
        }
      }
    }
    
  • mcp tools list message request

    {
      "jsonrpc": "2.0",
      "method": "tools/list",
      "params": {
        "cursor": "" // used for pagination; empty string on the first request
      },
      "id": 2 // request ID
    }
    
  • mcp tools list message response

    {
      "jsonrpc": "2.0",
      "id": 2, // matches the request ID
      "result": {
        "tools": [ // list of tool objects
          {
            "name": "self.get_device_status",
            "description": "...",
            "inputSchema": { ... } // parameter schema
          },
          {
            "name": "self.audio_speaker.set_volume",
            "description": "...",
            "inputSchema": { ... } // parameter schema
          }
          // ... more tools
        ],
        "nextCursor": "..." // present when the list is paginated; holds the cursor for the next request
      }
    }
    
  • listen message

    {
      "session_id": "xxx",
      "type": "listen",
      "state": "start",
      "mode": "manual"
    }
    
    • "session_id": session identifier
    • "type": "listen"
    • "state": "start", "stop", or "detect" (wake-word detection triggered)
    • "mode": "auto", "manual", or "realtime"; the recognition mode.
  • stt message

    {
      "session_id": "xxx",
      "type": "stt",
      "text": "..."
    }
    
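    • Indicates that the server recognized the user's speech (for example, the speech-to-text result).
    • The device may show this text on screen before moving on to the answer flow.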
  • llm message

    {
      "session_id": "xxx",
      "type": "llm",
      "emotion": "happy",
      "text": "😀"
    }
    
    • The server instructs the device to adjust its expression animation / UI.
  • mcp tools call message request

    {
      "jsonrpc": "2.0",
      "method": "tools/call",
      "params": {
        "name": "self.audio_speaker.set_volume", // name of the tool to call
        "arguments": {
          // tool arguments, as an object
          "volume": 50 // argument name and value
        }
      },
      "id": 3 // request ID
    }
    
  • mcp tools call message response

    {
      "jsonrpc": "2.0",
      "id": 3, // matches the request ID
      "result": {
        "content": [
          // tool execution result
          { "type": "text", "text": "true" } // example: set_volume returns a bool
        ],
        "isError": false // indicates success
      }
    }
    
    • Device success response
    {
      "jsonrpc": "2.0",
      "id": 3, // matches the request ID
      "error": {
        "code": -32601, // JSON-RPC error code, e.g. Method not found (-32601)
        "message": "Unknown tool: self.non_existent_tool" // error description
      }
    }
    
    • Device failure response
  • tts message

    {
      "session_id": "xxx",
      "type": "tts",
      "state": "start"
    }
    
    • The server is about to send TTS audio; the device enters the "speaking" playback state.
    {
      "session_id": "xxx",
      "type": "tts",
      "state": "stop"
    }
    
    • Indicates that this TTS round has ended.
    {
      "session_id": "xxx",
      "type": "tts",
      "state": "sentence_start",
      "text": "..."
    }
    
    • Lets the device display the text segment currently being played or read aloud (for example, to show it to the user).
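
A few of the messages above translate directly into small typed helpers. The sketch below is illustrative only: `ListenState`, `samples_per_frame`, and `handshake_headers` are hypothetical names, and `frame_duration` is assumed to be in milliseconds, which the messages do not state explicitly.

```rust
use std::str::FromStr;

/// `state` values of a `listen` message, as listed above.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ListenState {
    Start,
    Stop,
    Detect,
}

impl FromStr for ListenState {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "start" => Ok(Self::Start),
            "stop" => Ok(Self::Stop),
            "detect" => Ok(Self::Detect),
            other => Err(format!("unknown listen state: {other}")),
        }
    }
}

/// Samples per channel in one audio frame, derived from the `sample_rate`
/// (Hz) and `frame_duration` (assumed to be milliseconds) fields.
pub fn samples_per_frame(sample_rate: u32, frame_duration_ms: u32) -> u32 {
    sample_rate * frame_duration_ms / 1000
}

/// Builds the WebSocket handshake headers as (name, value) pairs.
/// A real client would pass these to whatever WebSocket library it uses.
pub fn handshake_headers(
    token: &str,
    protocol_version: u32,
    device_mac: &str,
    client_uuid: &str,
) -> Vec<(&'static str, String)> {
    vec![
        ("Authorization", format!("Bearer {token}")),
        ("Protocol-Version", protocol_version.to_string()),
        ("Device-Id", device_mac.to_string()),
        ("Client-Id", client_uuid.to_string()),
    ]
}
```

Under the millisecond assumption, the hello request above (16000 Hz, 60) corresponds to 960 samples per frame and the hello response (24000 Hz, 60) to 1440, so the two directions use different frame sizes.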

Server

File structure

TODO

Data flow

TODO

Model

LLM

| Model | Memory | File Size | Remark |
| --- | --- | --- | --- |
| unsloth/Qwen3-1.7B-GGUF | 2.5GB | 1.11GB | Qwen3-1.7B-Q4_K_M.gguf |

ASR

| Model | Memory | File Size | Remark |
| --- | --- | --- | --- |
| openai/whisper-tiny | 0.45GB | 0.15GB | |
| openai/whisper-small | 1.1GB | 0.96GB | |
| Qwen/Qwen3-ASR-0.6B | 2GB | 1.88GB | |
| openai/whisper-large-v3-turbo | 4GB | 1.62GB | |

TTS

| Model | Memory | File Size | Remark |
| --- | --- | --- | --- |
| mzdk100/kokoro | 0.12GB | 0.37GB | |
| openbmb/VoxCPM-0.5B | 2GB | 1.61GB | |

Install the CUDA Toolkit on Fedora 43 and set up the environment

sudo sh cuda_12.8.1_570.124.06_linux.run --toolkit --no-drm --silent --override
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda/lib64

``` shell
conda create -n cuda
# activate first so the packages below are installed into the new environment
conda activate cuda
# for candle library
conda install conda-forge::gcc==14.3.0
conda install conda-forge::gxx==14.3.0
# for openssl library
conda install anaconda::openssl
# develop...
```

Spec

https://rust-lang.github.io/api-guidelines/

https://rust-coding-guidelines.github.io/rust-coding-guidelines-zh/overview.html

ESP32

Checkout

git clone git@github.com:78/xiaozhi-esp32.git

Install ESP IDF

https://docs.espressif.com/projects/esp-idf/zh_CN/v5.5.2/esp32/get-started/linux-macos-setup.html

Development

Set up the environment and flash the device

  • esp32-s3
. $HOME/esp/esp-idf/export.sh
idf.py set-target esp32-s3
idf.py menuconfig
idf.py build
idf.py -p PORT flash
# macos
idf.py -p /dev/cu.usbserial-14410 flash
# linux
sudo chmod 777 /dev/ttyACM0
idf.py -p /dev/ttyACM0 flash

Other useful commands

  • Get PORT
ls /dev/cu.*
  • Debug monitor
idf.py monitor
idf.py -p PORT flash monitor

App

Warning

The app does not implement any chobits features yet; it only provides a basic framework.

Framework

  • Theme framework
  • Adaptive
    • Desktop or not
  • Text scale
  • I18n
  • Network
    • Dio
  • Database
    • Sqlite
  • Util
    • Unique Id
      • nanoid2
  • Event Bus
  • Auto upgrade
    • Android
  • Logging
    • Rotating log file (not in the web environment)
    • Export log file or upload log file
  • Release
    • Android
    • iOS
    • Windows
      • exe(unpack)
    • Linux
    • macOS
    • Web
  • Env Config
    • Dev
    • Prod

Advanced

  • Auth
    • Spring-authorization-server
  • User Profile

Develop Flow

  • Changelog
  • CI
    • Build
    • Code Quality
    • Test
  • Testing
    • Unit Test(Example)
    • Widget Test(Example)
    • Integration Test

Coding

Database Versioning

lib\modules\app\app_store.dart

//DB init
DbManager.instance().init([ChangeLogV1()]);
Database db = await DbManager.instance().open();

Database Record To View Model

Example

class MemoMapper {
  // Loads every row of the memo table and maps it to an entity.
  static Future<List<MemoEntity>> selectAll() async {
    final List<Map<String, dynamic>> findResult =
        await DbManager.instance().find(MemoEntity.tableName);
    return findResult.map((e) => MemoEntity.fromJson(e)).toList();
  }
}

class AppStore {
  // Converts database entities to view models via their JSON form.
  Future<List<MemoModel>> getMemoList() async {
    final List<MemoEntity> memoEntityList = await MemoMapper.selectAll();
    return memoEntityList.map((e) => MemoModel.fromJson(e.toJson())).toList();
  }
}

class _MemoPageState extends State<MemoPage> {
  List<MemoModel> memoModelList = [];

  @override
  void initState() {
    super.initState();
    Provider.of<AppStore>(context, listen: false).getMemoList().then((value) {
      // Skip the update if the widget was disposed while loading.
      if (!mounted) return;
      setState(() {
        memoModelList = value;
      });
    });
  }

  // build() omitted
}

Paging (pull to refresh and load more)

Example

class _MemoPageState extends State<MemoPage> {
  List<MemoModel> _memoModelList = [];
  late EasyRefreshController _controller;
  late int _pageNum;
  late int _pageSize;
  late int _total;

  @override
  void initState() {
    super.initState();
    resetPageInfo();
    _controller = EasyRefreshController(
      controlFinishRefresh: true,
      controlFinishLoad: true,
    );
  }

  @override
  void dispose() {
    _controller.dispose();
    super.dispose();
  }

  void resetPageInfo() {
    setState(() {
      _pageNum = 1;
      _pageSize = 10;
      _total = 0;
      _memoModelList = [];
    });
  }

  // Loads the current page and appends its rows to the list.
  Future<PageResult> _loadData() async {
    var result = await Provider.of<AppStore>(context, listen: false).pageMemo(
        PageParam(
            pageNum: _pageNum, pageSize: _pageSize, orderBy: " datetime desc"));
    if (!mounted) {
      return result;
    }
    _memoModelList.addAll(result.rows);
    _total = result.total;
    LogHelper.debug(
        "[Memo] loadData pageNum = $_pageNum,pageSize = $_pageSize,total = $_total");
    return result;
  }

  // Pull to refresh: start again from the first page.
  void _refreshList() async {
    resetPageInfo();
    await _loadData();
    // The widget may have been disposed while the data was loading.
    if (!mounted) return;
    setState(() {});
    _controller.finishRefresh();
    _controller.resetFooter();
  }

  // Load more: fetch the next page and report whether more pages remain.
  void _loadList() async {
    _pageNum++;
    PageResult result = await _loadData();
    if (!mounted) return;
    setState(() {});
    if (result.hasNext) {
      _controller.finishLoad(IndicatorResult.success);
    } else {
      _controller.finishLoad(IndicatorResult.noMore);
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
        body: EasyRefresh(
            refreshOnStart: true,
            controller: _controller,
            header: RefreshHeader(context),
            footer: RefreshFooter(context),
            onRefresh: _refreshList,
            onLoad: _loadList,
            child: CustomScrollView(slivers: [
              SliverGrid(
                delegate: SliverChildBuilderDelegate((context, index) {
                  return _createMemoItem(_memoModelList[index]);
                }, childCount: _memoModelList.length),
                gridDelegate: const SliverGridDelegateWithMaxCrossAxisExtent(
                    maxCrossAxisExtent: 210),
              )
            ])),
        floatingActionButton: _getFloatingActionButton());
  }
}

HttpRequest

Response result = await HttpClient.instance().get("/");
LogHelper.info(result.body().toString());

JSON Model Gen

dart run build_runner build

Release

Android

Build and release an Android app | Flutter

  1. Create a keystore

    keytool -genkey -v -keystore .\android-app-keystore.jks -storetype JKS -keyalg RSA -keysize 2048 -validity 10000 -alias android-app
    
  2. Create [project]/android/key.properties and reference the keystore from the app

    storePassword=<password from previous step>
    keyPassword=<password from previous step>
    keyAlias=android-app
    storeFile=<location of the key store file, such as /Users/<user name>/android-app-keystore.jks or C:\\Users\\<user name>\\android-app-keystore.jks>
    
  3. Run the release command (e.g. for the prod env)

    flutter build apk --dart-define=DART_DEFINE_APP_ENV=prod
    

Conventional Commits

https://github.com/angular/angular/blob/main/CONTRIBUTING.md#-commit-message-format

https://www.conventionalcommits.org/en/v1.0.0/

https://www.conventionalcommits.org/zh-hans/v1.0.0/

The Conventional Commits specification is a lightweight convention on top of commit messages. It provides an easy set of rules for creating an explicit commit history; which makes it easier to write automated tools on top of. This convention dovetails with SemVer, by describing the features, fixes, and breaking changes made in commit messages.

The commit message should be structured as follows:


<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

The commit contains the following structural elements, to communicate intent to the consumers of your library:

  1. fix: a commit of the type fix patches a bug in your codebase (this correlates with PATCH in Semantic Versioning).
  2. feat: a commit of the type feat introduces a new feature to the codebase (this correlates with MINOR in Semantic Versioning).
  3. BREAKING CHANGE: a commit that has a footer BREAKING CHANGE:, or appends a ! after the type/scope, introduces a breaking API change (correlating with MAJOR in Semantic Versioning). A BREAKING CHANGE can be part of commits of any type.
  4. types other than fix: and feat: are allowed, for example @commitlint/config-conventional (based on the Angular convention) recommends build:, chore:, ci:, docs:, style:, refactor:, perf:, test:, and others.
  5. footers other than BREAKING CHANGE: <description> may be provided and follow a convention similar to git trailer format.

Additional types are not mandated by the Conventional Commits specification, and have no implicit effect in Semantic Versioning (unless they include a BREAKING CHANGE). A scope may be provided to a commit’s type, to provide additional contextual information and is contained within parenthesis, e.g., feat(parser): add ability to parse arrays.
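
As an illustration of the structure described above, the first line of a commit message can be parsed into its parts. This is a minimal sketch, not a full implementation of the specification: `CommitHeader` and `parse_header` are hypothetical names, the type is restricted to ASCII letters as a simplification, and `BREAKING CHANGE:` footers are not handled.

```rust
/// Parsed first line of a Conventional Commits message.
#[derive(Debug, PartialEq)]
pub struct CommitHeader {
    pub kind: String,
    pub scope: Option<String>,
    pub breaking: bool,
    pub description: String,
}

/// Parses `<type>[optional scope][!]: <description>`.
/// Returns `None` when the line does not follow that shape.
pub fn parse_header(line: &str) -> Option<CommitHeader> {
    let (prefix, description) = line.split_once(": ")?;
    // A `!` right before the colon marks a breaking change.
    let (prefix, breaking) = match prefix.strip_suffix('!') {
        Some(rest) => (rest, true),
        None => (prefix, false),
    };
    // An optional scope is wrapped in parentheses after the type.
    let (kind, scope) = match prefix.split_once('(') {
        Some((kind, rest)) => (kind, Some(rest.strip_suffix(')')?.to_string())),
        None => (prefix, None),
    };
    if scope.as_deref() == Some("") || description.trim().is_empty() {
        return None;
    }
    // Simplification: only non-empty, letters-only types are accepted.
    if kind.is_empty() || !kind.chars().all(|c| c.is_ascii_alphabetic()) {
        return None;
    }
    Some(CommitHeader {
        kind: kind.to_string(),
        scope,
        breaking,
        description: description.trim().to_string(),
    })
}
```

A check like this could run in a commit hook or CI step to reject messages whose first line does not match the convention.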

Sqlite3 on web

sqflite_common_ffi_web

Setup binaries

Implementation requires sqlite3.wasm binaries into your web folder as well as a sqflite specific shared worker.

You can install binaries using the command:

dart run sqflite_common_ffi_web:setup

It should create the following files in your web folder:

  • sqlite3.wasm
  • sqflite_sw.js

that you can put in source control or not (personally I don’t)

Note: when sqlite3 and its wasm binary are updated, you may need to run the command again using the force option:

dart run sqflite_common_ffi_web:setup --force

Offline Map

  1. Generate XYZ tiles(Directory)

    QGIS->Toolbox->Generate XYZ tiles(Directory)

  2. Generate the asset paths for pubspec.yaml

ls -R assets | grep ':'

Coordinate

  1. The server stores coordinates in WGS84.

  2. The client map is Gaode Map, which uses GCJ-02.

  3. Coordinates must therefore be converted from WGS84 to GCJ-02; the coordtransform_dart utility is used for this.
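
For reference, the conversion itself can be sketched as below. This follows the widely circulated approximation of the WGS84 to GCJ-02 offset and is not the code of coordtransform_dart; the function names, constant names, and the bounding box used to skip points outside China are assumptions of this sketch.

```rust
use std::f64::consts::PI;

const A: f64 = 6378245.0; // semi-major axis used by the approximation
const EE: f64 = 0.006_693_421_622_965_943; // eccentricity squared

fn transform_lat(x: f64, y: f64) -> f64 {
    let mut ret =
        -100.0 + 2.0 * x + 3.0 * y + 0.2 * y * y + 0.1 * x * y + 0.2 * x.abs().sqrt();
    ret += (20.0 * (6.0 * x * PI).sin() + 20.0 * (2.0 * x * PI).sin()) * 2.0 / 3.0;
    ret += (20.0 * (y * PI).sin() + 40.0 * (y / 3.0 * PI).sin()) * 2.0 / 3.0;
    ret += (160.0 * (y / 12.0 * PI).sin() + 320.0 * (y * PI / 30.0).sin()) * 2.0 / 3.0;
    ret
}

fn transform_lng(x: f64, y: f64) -> f64 {
    let mut ret =
        300.0 + x + 2.0 * y + 0.1 * x * x + 0.1 * x * y + 0.1 * x.abs().sqrt();
    ret += (20.0 * (6.0 * x * PI).sin() + 20.0 * (2.0 * x * PI).sin()) * 2.0 / 3.0;
    ret += (20.0 * (x * PI).sin() + 40.0 * (x / 3.0 * PI).sin()) * 2.0 / 3.0;
    ret += (150.0 * (x / 12.0 * PI).sin() + 300.0 * (x / 30.0 * PI).sin()) * 2.0 / 3.0;
    ret
}

/// Converts a WGS84 (lng, lat) pair in degrees to GCJ-02.
/// Points outside the rough bounding box of China are returned unchanged.
pub fn wgs84_to_gcj02(lng: f64, lat: f64) -> (f64, f64) {
    let in_china = lng > 73.66 && lng < 135.05 && lat > 3.86 && lat < 53.55;
    if !in_china {
        return (lng, lat);
    }
    let d_lat = transform_lat(lng - 105.0, lat - 35.0);
    let d_lng = transform_lng(lng - 105.0, lat - 35.0);
    let rad_lat = lat / 180.0 * PI;
    let magic = 1.0 - EE * rad_lat.sin() * rad_lat.sin();
    let sqrt_magic = magic.sqrt();
    // Convert the offsets from metres-like units to degrees.
    let d_lat = (d_lat * 180.0) / ((A * (1.0 - EE)) / (magic * sqrt_magic) * PI);
    let d_lng = (d_lng * 180.0) / (A / sqrt_magic * rad_lat.cos() * PI);
    (lng + d_lng, lat + d_lat)
}
```

The offset is small (a few thousandths of a degree) but large enough to misplace markers by hundreds of metres if the conversion is skipped.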

Q&A

Q1. setState() or markNeedsBuild() called during build. This ModelBinding widget cannot be marked as needing to build because the framework is already in the process of building widgets.

https://fluttercorner.com/setstate-or-markneedsbuild-called-during-build-a-vertical-renderflex-overflowed/

Solution 1: use a callback. The error occurs because setState is called before the build method has finished building the widgets. Deferring the call to a post-frame callback avoids it.

WidgetsBinding.instance.addPostFrameCallback((_){

// Your Code Here

});

Other

  1. pod install is slow
cd ./ios/
# If you use a proxy tool such as Clash, run the proxy setup command below so dependencies are downloaded through the proxy
#export https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890
pod install --verbose

Related Projects

FAQ

  1. Why is the project named chobits?

    It’s a dream about chobits and Planetarian: The Reverie of a Little Planet