Add component-audio-player skill: component audio export player

2026-06-26 11:57:57 +08:00 · 2026-06-26 11:57:57 +08:00 · cf9f0411e0
commit cf9f0411e0
parent 74f1c392c4
3 changed files with 190 additions and 1 deletions
--- a/.vala_skill_hashes
+++ b/.vala_skill_hashes
@@ -11,4 +11,8 @@ studytime-analysis fefb11a0c2fb7085a47c626ec6b72f8fcafee797dc3340abea09139d31eb7
 studycourse-analysis 467051001a8a087aa0526f0102593e0b0ed563cb4627f5f660dc718efc29699b
 user-info 0bb7007cbb9fc7659be1bf64f4f79418fbd25434dc61e8c271103cec82a2a759
 douyin-live-analyst 6459a76cab6e81655ac14691309b4aec816c1e949b0b2e1f8de2e081895403de
-export-user-data 2cb9de17ea0eac3da1073060321f66dfd32d654ac75de40ccdfef1d4bed552fe
+component-audio-player a4d6fa206d7e1d0468e9cb9815424d129df80e808507b6872bf7aaf715fdbef5
+export-user-data 05b32e97d98963348366f699977efce028c3f4698862297afa4bafbd395cbde8
+full-data-refresh 7ae21cd2542bb12a9fb970afad4f76660932ffb7c86374039f2c4399ad63431c
+lark-send-message-as-bot 2c57e4b2ae0c042b28a8d1262ca22105c1976bd03ae5a4bdeb86a2870af917e0
+wechat-article-dachen 933e7022e181e03287d574f988fa48e3bbcee5dcda1291b8b5643c02398d21fc
--- a/skills/component-audio-player/SKILL.md
+++ b/skills/component-audio-player/SKILL.md
@@ -0,0 +1,39 @@
+---
+name: component-audio-player
+slug: component-audio-player
+version: 1.0.0
+description: Export user audio recordings from learning components and generate self-contained HTML playback pages with inline audio players.
+metadata:
+  openclaw:
+    requires:
+      bins: ["python3", "curl", "jq"]
+    categories: ["feishu", "data-export", "audio"]
+---
+
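+# Component Audio Export Player
+
+Exports user recordings for a given interactive component and generates a playable HTML page. Each record shows the reference text, the text the user read aloud, the pronunciation score, and a ▶ play button.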
+
+## When to Use
+
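+Trigger when the user asks to export a component's recording records (Oops, Perfect, and so on) and wants to listen to the users' voices directly.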
+
+## Core Rules
+
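+1. **Run the export script first**: `scripts/export_component_records.sh --c-type <type> --c-id <id>` fetches the full data set
+2. **Filter the target records**: filter by `play_result`, sort by `updated_at` descending, and take the top N
+3. **Do not download audio**: `<audio>` elements in the HTML reference the CDN URL (`static.valavala.com`) directly, so no download step is needed
+4. **Self-contained HTML**: a single file that plays when opened in a browser, with no local server required (audio still streams from the CDN)
+5. **Send the Excel file too**: lets the user do further data analysis
+6. **Send via the bot**: use the `lark-send-message-as-bot` skill to send the HTML and Excel files to the requesting user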
+
+## Quick Reference
+
+| Content | File |
+|------|------|
+| Detailed steps | `workflow.md` |
+
+## Data Storage
+
+- Intermediate artifacts (Excel, HTML) are written to the `output/` directory
+- No persistent audio cache is created
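Core Rule 2 in the SKILL.md above amounts to a filter followed by a descending sort and a cut-off. A minimal standard-library sketch of that selection, not part of the commit: it assumes each record is a dict with `play_result` and `updated_at` keys, and that `updated_at` is a `YYYY-MM-DD HH:MM:SS` string. The function name `select_recent` and the sample records are hypothetical.

```python
from typing import Dict, List


def select_recent(records: List[Dict[str, str]],
                  play_result: str = "Oops",
                  limit: int = 50) -> List[Dict[str, str]]:
    """Keep records with the given play_result, newest first, at most `limit`."""
    matching = [r for r in records if r.get("play_result") == play_result]
    # Assumption: timestamps share one ISO-like format, so plain string
    # comparison orders them chronologically.
    matching.sort(key=lambda r: r["updated_at"], reverse=True)
    return matching[:limit]


# Hypothetical sample data for illustration only.
sample = [
    {"user_id": "1", "play_result": "Oops", "updated_at": "2026-06-01 10:00:00"},
    {"user_id": "2", "play_result": "Perfect", "updated_at": "2026-06-02 10:00:00"},
    {"user_id": "3", "play_result": "Oops", "updated_at": "2026-06-03 10:00:00"},
]
top = select_recent(sample, limit=1)
```

The defaults mirror the ones the workflow documents (`Oops`, 50). The input list is left unmodified because the filter step builds a new list before sorting.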
--- a/skills/component-audio-player/workflow.md
+++ b/skills/component-audio-player/workflow.md
@@ -0,0 +1,146 @@
+# Component Audio Export Player: Detailed Steps
+
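+## Step ①: Parse parameters
+
+Extract the following parameters from the user's message:
+
+| Parameter | Description | Example |
+|------|------|------|
+| `c_type` | Component type | `mid_dialog_repeat` |
+| `c_id` | Component ID | `1101301` |
+| `play_result` | Result to filter on (default `Oops`) | `Oops` / `Perfect` / `Pass` / `Failed` |
+| `limit` | Number of most recent records to take (default 50) | `50` |
+
+If the user does not specify `c_type`, look it up by component ID in the MySQL table `middle_interaction_component` or `core_interaction_component`.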
+
+## Step ②: Export the full data set
+
+Run the export script:
+
+```bash
+cd /root/.openclaw/workspace-xiaoban
+bash scripts/export_component_records.sh --c-type <c_type> --c-id <c_id>
+```
+
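+Output: `output/组件_<name>_<c_id>_导出时间_<YYYYMMDD>.xlsx` (the literal filename parts `组件` and `导出时间` mean "component" and "export time")
+
+Fields included: `user_id`, `session_id`, `component_unique_code`, `c_type`, `c_id`, `组件名称` (component name), `组件标题` (component title), `mode`, `参考文本` (reference text), `play_result`, `发音评分` (pronunciation score), `音频URL` (audio URL), `朗读内容` (text read aloud), `user_behavior_info`, `updated_at`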
+
+## Step ③: Filter the target records
+
+```python
+import pandas as pd
+
+df = pd.read_excel('<Excel file from step ②>')
+target = df[df['play_result'] == '<play_result>'] \
+    .sort_values('updated_at', ascending=False) \
+    .head(<limit>)  # substitute the integer limit, e.g. 50
+target.to_excel('output/<filtered Excel file>', index=False)
+```
+
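+## Step ④: Extract audio information
+
+The export script has already extracted the audio URL and the read-aloud text from the `user_behavior_info` JSON, so use the `音频URL`, `朗读内容`, and `发音评分` columns directly.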
+
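+## Step ⑤: Generate the HTML playback page
+
+Generate a self-contained HTML file with one card per record:
+
+- Card header: sequence number, user_id, timestamp
+- Reference text (green)
+- Text the user read aloud (red)
+- Pronunciation score (if present)
+- `<audio>` player whose `src` points directly at the CDN URL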
+
+```python
+html = '''<!DOCTYPE html>
+<html lang="zh-CN">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Component audio - <c_type> <c_id></title>
+<style>
+* { margin: 0; padding: 0; box-sizing: border-box; }
+body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif; background: #f5f5f5; padding: 20px; }
+h1 { text-align: center; margin-bottom: 10px; color: #333; }
+.subtitle { text-align: center; color: #666; margin-bottom: 20px; font-size: 14px; }
+.card { background: white; border-radius: 8px; padding: 16px; margin-bottom: 12px; box-shadow: 0 1px 3px rgba(0,0,0,0.1); }
+.card-header { display: flex; justify-content: space-between; align-items: center; margin-bottom: 8px; flex-wrap: wrap; gap: 4px; }
+.card-num { font-weight: bold; color: #e74c3c; font-size: 14px; }
+.card-user { color: #666; font-size: 13px; }
+.card-time { color: #999; font-size: 12px; }
+.card-body { display: flex; flex-direction: column; gap: 6px; }
+.card-row { display: flex; gap: 10px; align-items: center; flex-wrap: wrap; }
+.label { font-size: 12px; color: #999; min-width: 60px; }
+.value { font-size: 14px; color: #333; }
+.ref { color: #27ae60; font-weight: 500; }
+.actual { color: #e74c3c; }
+audio { width: 100%; max-width: 400px; height: 36px; margin-top: 4px; }
+.no-audio { color: #999; font-style: italic; font-size: 13px; }
+.score { background: #f0f0f0; padding: 2px 8px; border-radius: 4px; font-size: 12px; }
+</style>
+</head>
+<body>
+<h1>🎙️ <component name> <c_id>: <play_result> records</h1>
+<p class="subtitle">Component: <component title> | Reference text: "<reference text>" | Latest <limit> records · newest first | Click ▶ to listen</p>
+'''
+
+for idx, (_, row) in enumerate(target.iterrows()):  # iterate the filtered records from step ③, not the full export
+    audio_url = row['音频URL'] if pd.notna(row['音频URL']) else ''
+    content = row['朗读内容'] if pd.notna(row['朗读内容']) else ''
+    ref = row['参考文本'] if pd.notna(row['参考文本']) else ''
+    score = row['发音评分'] if pd.notna(row['发音评分']) else ''
+    ts = str(row['updated_at'])[:19]
+
+    html += f'''
+<div class="card">
+  <div class="card-header">
+    <span class="card-num">#{idx+1} {row['play_result']}</span>
+    <span class="card-user">user_id: {row['user_id']}</span>
+    <span class="card-time">{ts}</span>
+  </div>
+  <div class="card-body">
+    <div class="card-row">
+      <span class="label">Reference text:</span>
+      <span class="value ref">{ref}</span>
+    </div>
+    <div class="card-row">
+      <span class="label">User read:</span>
+      <span class="value actual">{content or '(no text)'}</span>
+      {f'<span class="score">Pronunciation score: {score}</span>' if score != '' else ''}
+    </div>'''
+    if audio_url:
+        html += f'''
+    <audio controls preload="none">
+      <source src="{audio_url}" type="audio/wav">
+    </audio>'''
+    else:
+        html += '<div class="no-audio">⚠️ No audio file</div>'
+    html += '\n  </div>\n</div>'
+
+html += '\n</body>\n</html>'
+
+with open('output/<output filename>.html', 'w', encoding='utf-8') as f:
+    f.write(html)
+```
+
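+## Step ⑥: Send the files
+
+Use the `lark-send-message-as-bot` skill to send the HTML and Excel files to the requesting user as separate messages:
+
+1. Obtain a `tenant_access_token`
+2. Upload the HTML file (`file_type=stream`)
+3. Send the HTML file message (`receive_id_type=user_id`)
+4. Upload the Excel file (`file_type=xls`)
+5. Send the Excel file message
+
+Obtain the target `user_id` through the `lark-identify-sender` skill.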
+
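+## Dependencies
+
+| Dependency | Description |
+|------|------|
+| `scripts/export_component_records.sh` | Component data export script |
+| `lark-send-message-as-bot` | Sends Feishu messages as a bot |
+| `lark-identify-sender` | Identifies the Feishu user who sent the request |
+| Python `pandas`, `openpyxl` | Data processing and Excel read/write |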
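The card template in workflow.md interpolates spreadsheet values straight into HTML, so a reference text containing `<` or `&` could break the page. A hedged sketch of one card built with `html.escape` from the standard library follows. It is not part of the commit: `render_card`, its parameters, and the sample URL path are hypothetical, and the sketch sets `src` on `<audio>` directly so the browser detects the media type instead of assuming `audio/wav`.

```python
from html import escape


def render_card(index: int, user_id: str, timestamp: str, play_result: str,
                reference: str, spoken: str, audio_url: str = "") -> str:
    """Build one record card, escaping every value taken from the data."""
    if audio_url:
        # escape() also escapes quotes by default, so the URL is safe
        # inside the double-quoted src attribute.
        player = (f'<audio controls preload="none" '
                  f'src="{escape(audio_url)}"></audio>')
    else:
        player = '<div class="no-audio">No audio file</div>'
    return "\n".join([
        '<div class="card">',
        f'  <span class="card-num">#{index} {escape(play_result)}</span>',
        f'  <span class="card-user">user_id: {escape(str(user_id))}</span>',
        f'  <span class="card-time">{escape(timestamp)}</span>',
        f'  <span class="value ref">{escape(reference)}</span>',
        f'  <span class="value actual">{escape(spoken) or "(no text)"}</span>',
        f'  {player}',
        '</div>',
    ])


# Hypothetical record; only the CDN host comes from the skill description.
card = render_card(1, "42", "2026-06-26 11:57:57", "Oops", "a < b", "",
                   "https://static.valavala.com/x.wav?a=1&b=2")
```

Importing `escape` by name rather than the whole module avoids a clash with a local variable called `html`, which the workflow's own snippet uses for the page string.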