Add component-audio-player skill: component audio export player

2026-06-26 11:57:57 +08:00 · 2026-06-26 11:57:57 +08:00 · cf9f0411e0
commit cf9f0411e0
parent 74f1c392c4
3 changed files with 190 additions and 1 deletions
--- a/.vala_skill_hashes
+++ b/.vala_skill_hashes
@@ -11,4 +11,8 @@ studytime-analysis fefb11a0c2fb7085a47c626ec6b72f8fcafee797dc3340abea09139d31eb7
 studycourse-analysis 467051001a8a087aa0526f0102593e0b0ed563cb4627f5f660dc718efc29699b
 user-info 0bb7007cbb9fc7659be1bf64f4f79418fbd25434dc61e8c271103cec82a2a759
 douyin-live-analyst 6459a76cab6e81655ac14691309b4aec816c1e949b0b2e1f8de2e081895403de
-export-user-data 2cb9de17ea0eac3da1073060321f66dfd32d654ac75de40ccdfef1d4bed552fe
+component-audio-player a4d6fa206d7e1d0468e9cb9815424d129df80e808507b6872bf7aaf715fdbef5
 export-user-data 05b32e97d98963348366f699977efce028c3f4698862297afa4bafbd395cbde8
 full-data-refresh 7ae21cd2542bb12a9fb970afad4f76660932ffb7c86374039f2c4399ad63431c
 lark-send-message-as-bot 2c57e4b2ae0c042b28a8d1262ca22105c1976bd03ae5a4bdeb86a2870af917e0
 wechat-article-dachen 933e7022e181e03287d574f988fa48e3bbcee5dcda1291b8b5643c02398d21fc
--- a/skills/component-audio-player/SKILL.md
+++ b/skills/component-audio-player/SKILL.md
@@ -0,0 +1,39 @@
 ---
 name: component-audio-player
 slug: component-audio-player
 version: 1.0.0
 description: Export user audio recordings from learning components and generate self-contained HTML playback pages with inline audio players.
 metadata:
  openclaw:
    requires:
      bins: ["python3", "curl", "jq"]
    categories: ["feishu", "data-export", "audio"]
 ---
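 # Component Audio Export Player
 Exports user recordings for a given interactive component and generates a playable HTML page. Each record shows the reference text, the text the user read aloud, the pronunciation score, and a ▶ play button.
 ## When to Use
 Trigger this skill when the user asks to export a component's recordings by result (Oops, Perfect, and so on) and wants to listen to the users' audio directly.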
 ## Core Rules
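 1. **Run the export script first**: `scripts/export_component_records.sh --c-type <type> --c-id <id>` fetches the full data set
 2. **Filter the target records**: filter by `play_result`, sort by `updated_at` descending, and take the top N
 3. **Do not download audio**: the `<audio>` elements in the HTML reference the CDN URL (`static.valavala.com`) directly, so no download step is needed
 4. **Self-contained HTML**: a single file that plays when opened in a browser, with no server required (the audio itself still streams from the CDN, so network access is needed)
 5. **Send the Excel file as well**: so the user can run further analysis on the data
 6. **Send via the bot**: use the `lark-send-message-as-bot` skill to send the HTML and Excel files to the requesting user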
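Rule 3 relies on every audio URL pointing at the CDN. A minimal sketch of a pre-flight check, assuming the exported URLs are plain http(s) links on `static.valavala.com`; the helper name `is_cdn_url` and the example URLs are hypothetical and not part of the skill:

```python
from urllib.parse import urlparse

CDN_HOST = "static.valavala.com"  # host named in rule 3


def is_cdn_url(url: str) -> bool:
    """Return True when `url` is an http(s) link served from the CDN host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.hostname == CDN_HOST


# Hypothetical example URLs, for illustration only
print(is_cdn_url("https://static.valavala.com/audio/demo.wav"))
print(is_cdn_url("https://example.com/audio/demo.wav"))
```

Rows that fail the check could be rendered with the "no audio" placeholder instead of a player that will not load.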
 ## Quick Reference
 | Content | File |
 |------|------|
 | Detailed steps | `workflow.md` |
 ## Data Storage
 - Intermediate artifacts (Excel, HTML) are written to the `output/` directory
 - No persistent audio cache is created
--- a/skills/component-audio-player/workflow.md
+++ b/skills/component-audio-player/workflow.md
@@ -0,0 +1,146 @@
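 # Component Audio Export Player: Detailed Steps
 ## Step ①: Parse parameters
 Extract the following parameters from the user's message:
 | Parameter | Description | Example |
 |------|------|------|
 | `c_type` | Component type | `mid_dialog_repeat` |
 | `c_id` | Component ID | `1101301` |
 | `play_result` | Result to filter on (default: `Oops`) | `Oops` / `Perfect` / `Pass` / `Failed` |
 | `limit` | Number of most recent records to keep (default: 50) | `50` |
 If the user does not specify `c_type`, look it up by component ID in the MySQL table `middle_interaction_component` or `core_interaction_component`.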
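The defaults in the table can be captured in a small value object. This is only a sketch: the class name `ExportParams` is hypothetical, and it assumes the four `play_result` values listed above are the only valid ones, which the skill does not state explicitly.

```python
from dataclasses import dataclass

# Assumption: the four values from the table are the complete set
VALID_RESULTS = ("Oops", "Perfect", "Pass", "Failed")


@dataclass(frozen=True)
class ExportParams:
    c_type: str
    c_id: str
    play_result: str = "Oops"  # documented default
    limit: int = 50            # documented default

    def __post_init__(self) -> None:
        # Reject values that the later steps could not filter on
        if self.play_result not in VALID_RESULTS:
            raise ValueError(f"unknown play_result: {self.play_result!r}")
        if self.limit <= 0:
            raise ValueError("limit must be a positive integer")


params = ExportParams(c_type="mid_dialog_repeat", c_id="1101301")
print(params)
```

Validating early keeps a typo such as `oops` from silently producing an empty result in step ③.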
 ## Step ②: Export the full data set
 Run the export script:
 ```bash
 cd /root/.openclaw/workspace-xiaoban
 bash scripts/export_component_records.sh --c-type <c_type> --c-id <c_id>
 ```
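 Output: `output/组件_<名称>_<c_id>_导出时间_<YYYYMMDD>.xlsx`, where `<名称>` is the component name and `<YYYYMMDD>` is the export date.
 Columns: `user_id`, `session_id`, `component_unique_code`, `c_type`, `c_id`, `组件名称` (component name), `组件标题` (component title), `mode`, `参考文本` (reference text), `play_result`, `发音评分` (pronunciation score), `音频URL` (audio URL), `朗读内容` (text read aloud), `user_behavior_info`, `updated_at`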
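Because the file name embeds the component name and the export date, later steps need a way to find the file. A minimal sketch that globs on the naming pattern shown above; the function name `latest_export` is hypothetical, and it assumes the date suffix is always eight digits so that it sorts correctly as text:

```python
from pathlib import Path
from typing import Optional


def latest_export(output_dir: str, c_id: str) -> Optional[Path]:
    """Return the most recent export for `c_id`, or None if there is none."""
    pattern = f"组件_*_{c_id}_导出时间_*.xlsx"
    # The trailing YYYYMMDD piece of the stem sorts chronologically as plain text
    matches = sorted(
        Path(output_dir).glob(pattern),
        key=lambda p: p.stem.rsplit("_", 1)[-1],
    )
    return matches[-1] if matches else None


print(latest_export("output", "1101301"))
```

If two exports share the same date the choice between them is arbitrary, so a stricter version might fall back to the file modification time.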
 ## Step ③: Filter the target records
 ```python
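 import pandas as pd
 # Values parsed in step ①; the documented defaults are shown
 play_result, limit = 'Oops', 50
 df = pd.read_excel('<Excel file from step ②>')
 # Keep only the requested result, newest first, at most `limit` rows
 target = (
    df[df['play_result'] == play_result]
    .sort_values('updated_at', ascending=False)
    .head(limit)
 )
 target.to_excel('output/<filtered Excel file>', index=False)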
 ```
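The selection above is filter, sort descending, truncate. The same logic on plain dictionaries, as a dependency-free sketch that may be handy for unit tests; the function name `select_recent` and the sample rows are made up for illustration, and it assumes `updated_at` is an ISO-style `YYYY-MM-DD HH:MM:SS` string:

```python
from typing import Dict, List


def select_recent(rows: List[Dict[str, str]], play_result: str, limit: int) -> List[Dict[str, str]]:
    """Keep rows matching `play_result`, newest first, at most `limit` of them."""
    matching = [r for r in rows if r["play_result"] == play_result]
    # ISO-style timestamps sort chronologically when compared as text
    matching.sort(key=lambda r: r["updated_at"], reverse=True)
    return matching[:limit]


rows = [
    {"user_id": "1", "play_result": "Oops", "updated_at": "2026-06-01 10:00:00"},
    {"user_id": "2", "play_result": "Perfect", "updated_at": "2026-06-02 10:00:00"},
    {"user_id": "3", "play_result": "Oops", "updated_at": "2026-06-03 10:00:00"},
]
print([r["user_id"] for r in select_recent(rows, "Oops", 1)])  # → ['3']
```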
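 ## Step ④: Extract audio information
 The export script has already extracted the audio URL and the read-aloud text from the `user_behavior_info` JSON, so use the `音频URL`, `朗读内容`, and `发音评分` columns directly.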
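 ## Step ⑤: Generate the HTML playback page
 Generate a self-contained HTML file with one card per record:
 - Card header: sequence number, user_id, timestamp
 - Reference text (green)
 - Text the user read aloud (red)
 - Pronunciation score (if present)
 - `<audio>` player whose `src` is the CDN URL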
 ```python
 html = '''<!DOCTYPE html>
 <html lang="zh-CN">
 <head>
 <meta charset="UTF-8">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
 <title>组件音频 - <c_type> <c_id></title>
 <style>
 * { margin: 0; padding: 0; box-sizing: border-box; }
 body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif; background: #f5f5f5; padding: 20px; }
 h1 { text-align: center; margin-bottom: 10px; color: #333; }
 .subtitle { text-align: center; color: #666; margin-bottom: 20px; font-size: 14px; }
 .card { background: white; border-radius: 8px; padding: 16px; margin-bottom: 12px; box-shadow: 0 1px 3px rgba(0,0,0,0.1); }
 .card-header { display: flex; justify-content: space-between; align-items: center; margin-bottom: 8px; flex-wrap: wrap; gap: 4px; }
 .card-num { font-weight: bold; color: #e74c3c; font-size: 14px; }
 .card-user { color: #666; font-size: 13px; }
 .card-time { color: #999; font-size: 12px; }
 .card-body { display: flex; flex-direction: column; gap: 6px; }
 .card-row { display: flex; gap: 10px; align-items: center; flex-wrap: wrap; }
 .label { font-size: 12px; color: #999; min-width: 60px; }
 .value { font-size: 14px; color: #333; }
 .ref { color: #27ae60; font-weight: 500; }
 .actual { color: #e74c3c; }
 audio { width: 100%; max-width: 400px; height: 36px; margin-top: 4px; }
 .no-audio { color: #999; font-style: italic; font-size: 13px; }
 .score { background: #f0f0f0; padding: 2px 8px; border-radius: 4px; font-size: 12px; }
 </style>
 </head>
 <body>
 <h1>🎙️ <组件名称> <c_id> — <play_result> 记录</h1>
 <p class="subtitle">组件: <组件标题> | 参考文本: "<参考文本>" | 最近<limit>条 · 时间倒序 | 点击 ▶ 直接听</p>
 '''
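 from html import escape  # imported by name because `html` is the page string above
 # Iterate over the filtered rows from step ③ (`target`), not the full export
 for idx, (_, row) in enumerate(target.iterrows()):
    # Escape every value interpolated into the markup; missing cells become ''
    audio_url = escape(str(row['音频URL'])) if pd.notna(row['音频URL']) else ''
    content = escape(str(row['朗读内容'])) if pd.notna(row['朗读内容']) else ''
    ref = escape(str(row['参考文本'])) if pd.notna(row['参考文本']) else ''
    score = escape(str(row['发音评分'])) if pd.notna(row['发音评分']) else ''
    ts = str(row['updated_at'])[:19]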
    html += f'''
 <div class="card">
  <div class="card-header">
    <span class="card-num">#{idx+1} {play_result}</span>
    <span class="card-user">user_id: {row['user_id']}</span>
    <span class="card-time">{ts}</span>
  </div>
  <div class="card-body">
    <div class="card-row">
      <span class="label">参考文本:</span>
      <span class="value ref">{ref}</span>
    </div>
    <div class="card-row">
      <span class="label">用户朗读:</span>
      <span class="value actual">{content or '(无文本)'}</span>
      {f'<span class="score">发音评分: {score}</span>' if score else ''}
    </div>'''
    if audio_url:
        html += f'''
    <audio controls preload="none">
      <source src="{audio_url}" type="audio/wav">
    </audio>'''
    else:
        html += '<div class="no-audio">⚠️ 无音频文件</div>'
    html += '\n  </div>\n</div>'
 html += '\n</body>\n</html>'
 with open('output/<输出文件名>.html', 'w', encoding='utf-8') as f:
    f.write(html)
 ```
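The `<source>` element in the template fixes the type to `audio/wav`. If an export can contain other formats, one option is to pick the MIME type from the URL's extension. A hedged sketch, assuming the extension reflects the real format; `audio_tag` and the `MIME_BY_EXT` table are hypothetical helpers, not part of the skill:

```python
from html import escape
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Small explicit table; extend it if other formats show up in the export
MIME_BY_EXT = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".m4a": "audio/mp4",
    ".ogg": "audio/ogg",
}


def audio_tag(url: str) -> str:
    """Build an <audio> element for `url`, or a placeholder when it is empty."""
    if not url:
        return '<div class="no-audio">⚠️ 无音频文件</div>'
    ext = PurePosixPath(urlparse(url).path).suffix.lower()
    mime = MIME_BY_EXT.get(ext)
    type_attr = f' type="{mime}"' if mime else ""  # omit type when unknown
    return (
        '<audio controls preload="none">'
        f'<source src="{escape(url, quote=True)}"{type_attr}>'
        "</audio>"
    )


print(audio_tag("https://static.valavala.com/audio/demo.mp3"))
```

Leaving out the `type` attribute for unknown extensions lets the browser sniff the format rather than rejecting the source because of a wrong hint.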
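 ## Step ⑥: Send the files
 Use the `lark-send-message-as-bot` skill to send the HTML and Excel files to the requesting user as separate messages:
 1. Obtain a `tenant_access_token`
 2. Upload the HTML file (`file_type=stream`)
 3. Send the HTML file message (`receive_id_type=user_id`)
 4. Upload the Excel file (`file_type=xls`)
 5. Send the Excel file message
 The target `user_id` is obtained through the `lark-identify-sender` skill.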
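The upload and send steps differ only in their payloads. The sketch below builds those payloads without making any network call. The field names (`file_type`, `file_name`, `receive_id`, `msg_type`, `content`, `file_key`) follow the Feishu Open Platform IM v1 file-upload and message APIs as I understand them; treat them as assumptions and check them against what `lark-send-message-as-bot` actually sends. The helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict

# file_type values taken from the steps above; anything else falls back to "stream"
FILE_TYPE_BY_EXT = {".xlsx": "xls", ".xls": "xls"}


def upload_form(path: str) -> Dict[str, str]:
    """Form fields for the upload step (the file bytes are attached separately)."""
    p = Path(path)
    return {
        "file_type": FILE_TYPE_BY_EXT.get(p.suffix.lower(), "stream"),
        "file_name": p.name,
    }


def file_message(user_id: str, file_key: str) -> Dict[str, str]:
    """Body for a file message; `content` is a JSON string, not a nested object."""
    return {
        "receive_id": user_id,
        "msg_type": "file",
        "content": json.dumps({"file_key": file_key}),
    }


# Hypothetical file name, user id and file key, for illustration only
print(upload_form("output/player.html"))
print(file_message("example_user", "example_file_key"))
```

Keeping payload construction separate from the HTTP call makes the two upload/send pairs easy to test without a token.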
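 ## Dependencies
 | Dependency | Description |
 |------|------|
 | `scripts/export_component_records.sh` | Component data export script |
 | `lark-send-message-as-bot` | Sends Feishu (Lark) messages as a bot |
 | `lark-identify-sender` | Identifies the Feishu (Lark) sender |
 | Python `pandas`, `openpyxl` | Data processing and Excel read/write |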