auto backup 2026-06-25 08:10:30
parent fe2c100193
commit c81e1b532f
58
business_knowledge/README.md
Normal file
@ -0,0 +1,58 @@
# Data Knowledge Base Index

> Documentation of the company's data structures, supporting data analysis, business queries, report generation, and similar work.
> All credentials are stored in `~/.hermes/.env`; this document contains only redacted references.

## Data Infrastructure Overview

| Type | Environment | Credential prefix | Purpose | Status |
|------|------|----------|------|------|
| MySQL 8.0 | Production | `VALA_MYSQL_ONLINE_*` | Users / orders / configuration | ✅ |
| MySQL 8.0 | Test | `VALA_MYSQL_TEST_*` | Latest configuration / development testing | ✅ |
| PostgreSQL 17 | Production | `VALA_PG_ONLINE_*` | User behavior data | ✅ |
| PostgreSQL 17 | Test | `VALA_PG_TEST_*` | Test behavior data | ✅ |
| Elasticsearch 7.10 | Production | `VALA_ES_ONLINE_*` | Service logs | ✅ |
| Elasticsearch | Test | `VALA_ES_TEST_*` | Service logs | ⚠️ Blocked by IP allowlist |

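The credential prefixes in the table above can be resolved with a small loader. The following is a minimal sketch, assuming `~/.hermes/.env` contains plain `KEY=VALUE` lines; the helper names `parse_env` and `credentials_for` and all sample values are hypothetical and not part of the repository.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and comments (assumed file format)."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def credentials_for(env: Dict[str, str], prefix: str) -> Dict[str, str]:
    """Return the settings under one prefix, keyed by the lower-cased suffix."""
    return {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix)
    }


# Made-up sample content; real values live only in ~/.hermes/.env.
sample = """
# production PostgreSQL (hypothetical values)
VALA_PG_ONLINE_HOST=pg.example.internal
VALA_PG_ONLINE_PORT=5432
VALA_MYSQL_TEST_HOST="mysql-test.example.internal"
"""

pg_online = credentials_for(parse_env(sample), "VALA_PG_ONLINE_")
print(pg_online)
```

Grouping by prefix keeps each data source's settings together, so switching between production and test only means changing the prefix string.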
## Data Domains

### MySQL: Business Databases

| Database | Environment | Description |
|------|------|------|
| `vala_user` | Production / Test | [Accounts & characters](data_dict/vala_user.md) |
| `vala` | Production / Test | Main business database (configuration, content) |
| `vala_order` | Production / Test | Order data |
| `vala_gray` | Production | Gray-release configuration |
| `vala_dev` | Test | Development configuration |
| `vala_bak` | Test | Backup data |

### PostgreSQL: Behavior Databases

| Database | Environment | Main tables (partial) |
|------|------|---------------|
| `vala` | Production | `user_chapter_play_record_*`, `user_lesson_handbook`, `gashapon_config`, `vala_pilot_explain_*` |
| `vala_test` | Test | `user_chapter_play_record_*`, `user_component_play_record_*`, `user_lesson_handbook`, `account_event_count` |

### Elasticsearch: Log Search

| Environment | Version | Purpose |
|------|------|------|
| Production | 7.10.1 | Production service logs |
| Test | - | ⚠️ The current machine's IP is not on the allowlist |

## Utility Scripts

| Script | Path | Description |
|------|------|------|
| phone_encrypt | `scripts/phone_encrypt.py` | XXTEA encryption/decryption and MD5 of phone numbers |

## Reference Documents

| Document | Description |
|------|------|
| [references/手机号查询角色ID方法.md](references/手机号查询角色ID方法.md) | Original document on looking up characters by phone number |

---

*Under continuous development. Update this index whenever a new data source is added.*
87
business_knowledge/data_dict/vala_user.md
Normal file
@ -0,0 +1,87 @@
# vala_user: Account & Character Data Dictionary

> MySQL production database that stores account and character information.

## Connection Information

All credentials are stored in `~/.hermes/.env` (MySQL production → `VALA_MYSQL_ONLINE_*`, MySQL test → `VALA_MYSQL_TEST_*`).

## Entity Relationship Diagram

```
vala_app_account (account)       vala_app_character (character)
┌─────────────────────┐          ┌──────────────────────────┐
│ id (PK)             │◄─────────│ account_id (FK)          │
│ tel                 │   1:N    │ id (PK)                  │
│ tel_encrypt         │          │ nickname                 │
└─────────────────────┘          │ gender                   │
                                 │ birthday                 │
                                 │ purchase_season_package  │
                                 │ created_at               │
                                 └──────────────────────────┘
```

- One account can have multiple characters (for example, one character per child).
- Join condition: `vala_app_character.account_id = vala_app_account.id`

## Table Structure

### vala_app_account (account table)

| Column | Type | Description |
|------|------|------|
| `id` | bigint | Account ID (primary key) |
| `tel` | varchar(20) | Phone number (masked for display, e.g. `158****7007`) |
| `tel_encrypt` | varchar(100) | Phone number ciphertext (XXTEA + URL-safe Base64) |

### vala_app_character (character table)

| Column | Type | Description |
|------|------|------|
| `id` | bigint | Character ID (primary key) |
| `account_id` | bigint | Owning account ID (FK → vala_app_account.id) |
| `nickname` | varchar(20) | Character nickname |
| `gender` | tinyint(1) | Gender |
| `birthday` | varchar(50) | Birthday |
| `purchase_season_package` | text | Purchased season packages |
| `created_at` | datetime | Creation time |

## Phone Number Encryption

- **Algorithm**: XXTEA
- **Key**: stored in `~/.hermes/.env` → `VALA_PHONE_XXTEA_KEY`
- **Encoding**: URL-safe Base64 variant (`+`→`-`, `/`→`_`, `=`→`.`)
- **Utility script**: `business_knowledge/scripts/phone_encrypt.py`

### Encryption Flow

```
plaintext phone number → XXTEA encryption → Base64 → URL-safe substitution → tel_encrypt ciphertext
```

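The last two steps of the flow (Base64, then the URL-safe substitution) can be sketched on their own. This is a minimal illustration, not the repository's implementation: the XXTEA step is left out, the helper names `to_tel_encoding` and `from_tel_encoding` are hypothetical, and the input bytes are made up. This variant maps the `=` padding to `.`, so it differs from Python's `base64.urlsafe_b64encode`, which keeps `=`.

```python
import base64


def to_tel_encoding(raw: bytes) -> str:
    """Base64-encode bytes, then substitute + with -, / with _, and = with a dot."""
    text = base64.b64encode(raw).decode("ascii")
    return text.replace("+", "-").replace("/", "_").replace("=", ".")


def from_tel_encoding(text: str) -> bytes:
    """Reverse the substitution, then Base64-decode."""
    restored = text.replace("-", "+").replace("_", "/").replace(".", "=")
    return base64.b64decode(restored)


# Stand-in for XXTEA output; these bytes are made up for illustration.
fake_cipher = bytes([0xFB, 0xFF, 0xFE, 0x00])
encoded = to_tel_encoding(fake_cipher)
print(encoded)
assert from_tel_encoding(encoded) == fake_cipher
```

The substitution is reversible because `-`, `_`, and `.` never occur in standard Base64 output.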
### Query Flow

1. Use `phone_encrypt.py` to encrypt the phone number into ciphertext.
2. Match the ciphertext exactly against `vala_app_account.tel_encrypt`.
3. JOIN `vala_app_character` to get the list of characters.

```sql
SELECT
    a.id AS account_id,
    a.tel,
    c.id AS character_id,
    c.nickname,
    c.gender,
    c.birthday,
    c.purchase_season_package
FROM vala_app_account a
LEFT JOIN vala_app_character c ON c.account_id = a.id
WHERE a.tel_encrypt = '<ciphertext>';
```

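To see the shape of the result, the same join can be run against an in-memory SQLite database. This is a sketch under stated assumptions: the real tables live in MySQL, the schema is reduced to the columns used here, and every row as well as the ciphertext `demo-cipher..` is made up. A bound parameter replaces the quoted placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vala_app_account (id INTEGER PRIMARY KEY, tel TEXT, tel_encrypt TEXT);
    CREATE TABLE vala_app_character (id INTEGER PRIMARY KEY, account_id INTEGER, nickname TEXT);
    -- Made-up rows for illustration only.
    INSERT INTO vala_app_account VALUES (1, '138****0000', 'demo-cipher..');
    INSERT INTO vala_app_account VALUES (2, '139****1111', 'other-cipher..');
    INSERT INTO vala_app_character VALUES (10, 1, 'Alice');
    INSERT INTO vala_app_character VALUES (11, 1, 'Bob');
""")

# Exact match on the ciphertext, passed as a bound parameter.
rows = conn.execute(
    """
    SELECT a.id AS account_id, a.tel, c.id AS character_id, c.nickname
    FROM vala_app_account a
    LEFT JOIN vala_app_character c ON c.account_id = a.id
    WHERE a.tel_encrypt = ?
    ORDER BY c.id
    """,
    ("demo-cipher..",),
).fetchall()

for row in rows:
    print(row)
conn.close()
```

Because the join is a `LEFT JOIN`, an account without characters still yields one row, with NULL in the character columns.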
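## Notes

1. **The `tel` column is masked** (e.g. `158****7007`), so it cannot be used for exact matching.
2. **Matching must use the `tel_encrypt` ciphertext.**
3. **One account can have multiple characters**, so a query may return multiple rows.
4. The `tel_encrypt` values are the same in the test and production environments (the encryption algorithm is identical).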
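Why the masked column cannot identify a phone number can be shown in a few lines. The masking rule below (keep the first three and last four digits) is an assumption inferred from the `158****7007` example, and `mask_phone` is a hypothetical helper; both sample numbers are made up.

```python
def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 digits and star out the middle (assumed rule)."""
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


# Two different made-up numbers collapse to the same masked value,
# so an equality match on the masked column would be ambiguous.
first = mask_phone("13800010000")
second = mask_phone("13899990000")
print(first, second, first == second)
```

The ciphertext column does not have this problem, which is why lookups go through `tel_encrypt`.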
100
business_knowledge/references/手机号查询角色ID方法.md
Normal file
@ -0,0 +1,100 @@
# Phone Number → Account ID → Character ID Lookup Method

> ⚠️ This is the redacted reference version. Credentials have been moved to `~/.hermes/.env`; the data dictionary is in `data_dict/vala_user.md`.

## Data Relationships

```
phone number (plaintext)
    │  XXTEA encryption
    ▼
tel_encrypt (ciphertext)         account_id
    │                                │
    ▼                                ▼
vala_app_account  ──────────►  vala_app_character
(account table)    1:N join    (character table)
```

- **One account** (`vala_app_account`) can have **multiple characters** (`vala_app_character`).
- Join condition: `vala_app_character.account_id = vala_app_account.id`

## Database

| Item | Value |
|------|-----|
| Database | MySQL production environment |
| Database name | `vala_user` |
| User | `read_only` |

> Read the actual connection details from `~/.hermes/.env`.

## Table Structure

### vala_app_account (account table)

| Column | Type | Description |
|------|------|------|
| `id` | bigint | Account ID (primary key) |
| `tel` | varchar(20) | Phone number (masked for display, e.g. `158****7007`) |
| `tel_encrypt` | varchar(100) | Phone number ciphertext (used for exact matching) |

### vala_app_character (character table)

| Column | Type | Description |
|------|------|------|
| `id` | bigint | Character ID (primary key) |
| `account_id` | bigint | Owning account ID |
| `nickname` | varchar(20) | Character nickname |
| `gender` | tinyint(1) | Gender |
| `birthday` | varchar(50) | Birthday |
| `purchase_season_package` | text | Purchased season packages |

## Phone Number Encryption Method

Phone numbers are stored in the database as ciphertext, encrypted with **XXTEA + URL-safe Base64**.

The key is read from `~/.hermes/.env` → `VALA_PHONE_XXTEA_KEY`.

## Query Steps

### Step 1: Encrypt the phone number

```bash
python3 business_knowledge/scripts/phone_encrypt.py encrypt 15849377007
```

### Step 2: Query the account and characters by ciphertext

```sql
SELECT
    a.id AS account_id,
    a.tel,
    c.id AS character_id,
    c.nickname,
    c.gender,
    c.birthday,
    c.purchase_season_package,
    c.created_at
FROM vala_app_account a
LEFT JOIN vala_app_character c ON c.account_id = a.id
WHERE a.tel_encrypt = '<ciphertext>';
```

### Step 3: Interpret the results

```
account_id  tel          character_id  nickname  gender  birthday    purchase_season_package
18279       158****7007  23600         Morris    1       2021-09-09  [16,17,18,19,20]
18279       158****7007  23686         Nathan    1       2018-03-13  [16]
```

- **Account ID**: 18279
- **Characters**: 23600 (Morris), 23686 (Nathan)
- One account may have multiple characters (one character per child).

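The interpretation in step 3 amounts to grouping the joined rows by account. A minimal sketch, assuming rows arrive as tuples in the column order of the sample result and that `purchase_season_package` holds a JSON array stored as text (the column type is `text`; the JSON format is inferred from the sample). `group_characters` is a hypothetical helper.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

# Rows copied from the sample result in step 3.
rows: List[Tuple] = [
    (18279, "158****7007", 23600, "Morris", 1, "2021-09-09", "[16,17,18,19,20]"),
    (18279, "158****7007", 23686, "Nathan", 1, "2018-03-13", "[16]"),
]


def group_characters(result: List[Tuple]) -> Dict[int, List[dict]]:
    """Collect each account's characters from the joined rows."""
    grouped: Dict[int, List[dict]] = defaultdict(list)
    for account_id, _tel, character_id, nickname, _gender, _birthday, packages in result:
        characters = grouped[account_id]  # registers the account even without characters
        if character_id is None:
            continue  # NULL row produced by the LEFT JOIN for an account with no characters
        characters.append({
            "character_id": character_id,
            "nickname": nickname,
            "season_packages": json.loads(packages) if packages else [],
        })
    return dict(grouped)


by_account = group_characters(rows)
print(by_account)
```

Keying the result by account ID makes the one-to-many relationship explicit and keeps accounts without characters visible as empty lists.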
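## Notes

1. **The `tel` column is masked** (e.g. `158****7007`), so it cannot be used directly for exact matching.
2. **Matching must use the `tel_encrypt` ciphertext**, which is produced by XXTEA encryption.
3. **One account can have multiple characters**, so a query may return multiple rows.
4. The `tel_encrypt` values are the same in the test and production environments (the encryption algorithm is identical).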
Binary file not shown.
84
business_knowledge/scripts/phone_encrypt.py
Normal file
@ -0,0 +1,84 @@
#!/usr/bin/env python3
"""
Phone number encryption/decryption utility. Reads the XXTEA key from ~/.hermes/.env.

Usage:
    from phone_encrypt import encrypt_phone, decrypt_phone, phone_md5

    cipher = encrypt_phone("13800138000")
    phone = decrypt_phone(cipher)
    md5 = phone_md5("13800138000")

Command line:
    python phone_encrypt.py encrypt 13800138000
    python phone_encrypt.py decrypt CxMOc6z56aYjE73r8OSAog..
"""

import os
import re
import hashlib
import base64

try:
    import xxtea
except ImportError:
    raise ImportError(
        "Please install xxtea first: pip install xxtea-py"
    )


def _load_key() -> str:
    """Load the XXTEA key from ~/.hermes/.env."""
    env_path = os.path.expanduser("~/.hermes/.env")
    if not os.path.exists(env_path):
        raise FileNotFoundError(f".env file not found: {env_path}")

    with open(env_path, "r") as f:
        content = f.read()

    match = re.search(r"VALA_PHONE_XXTEA_KEY=(.+)", content)
    if not match:
        raise ValueError("VALA_PHONE_XXTEA_KEY not found in .env")

    return match.group(1).strip().strip('"')


KEY = _load_key()


def encrypt_phone(phone: str) -> str:
    """Encrypt a plaintext phone number into the ciphertext stored in the tel_encrypt column."""
    encrypted = xxtea.encrypt(phone.encode(), KEY.encode())
    result = base64.b64encode(encrypted).decode()
    # URL-safe substitution, including the padding character
    result = result.replace("+", "-").replace("/", "_").replace("=", ".")
    return result


def decrypt_phone(encrypted: str) -> str:
    """Decrypt a tel_encrypt value back into the plaintext phone number."""
    # Undo the URL-safe substitution before Base64-decoding
    restored = encrypted.replace("-", "+").replace("_", "/").replace(".", "=")
    decrypted = xxtea.decrypt(base64.b64decode(restored), KEY.encode())
    return decrypted.decode()


def phone_md5(phone: str) -> str:
    """MD5 of the phone number (used for cross-system joins)."""
    return hashlib.md5(phone.encode()).hexdigest()


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 3:
        print("Usage: python phone_encrypt.py <encrypt|decrypt> <phone number|ciphertext>")
        sys.exit(1)

    action, value = sys.argv[1], sys.argv[2]

    if action == "encrypt":
        print(encrypt_phone(value))
    elif action == "decrypt":
        print(decrypt_phone(value))
    else:
        print(f"Unknown action: {action}; supported actions are encrypt / decrypt")
        sys.exit(1)
BIN
output/知识巩固_音频_23600_23686_20260624_161846.xlsx
Normal file
Binary file not shown.
BIN
output/知识巩固_题目详情_23600_23686_20260624_162717.xlsx
Normal file
Binary file not shown.
BIN
output/知识巩固_题目详情_23600_23686_20260624_163135.xlsx
Normal file
Binary file not shown.
BIN
output/知识巩固_题目详情_23600_23686_20260624_163337.xlsx
Normal file
Binary file not shown.
289
scripts/export_review_audio.py
Normal file
@ -0,0 +1,289 @@
#!/usr/bin/env python3
"""
Export the unit review data and raw audio for the given characters.
Usage: python3 export_review_audio.py <character_id_1> [character_id_2] ...
    python3 export_review_audio.py 23600 23686
"""
import re, json, sys, os, subprocess
from datetime import datetime

# ── Load .env ───────────────────────────────────────
def load_env():
    """Read ~/.hermes/.env once and return a lookup function for its keys."""
    env_path = os.path.expanduser("~/.hermes/.env")
    with open(env_path) as f:
        content = f.read()
    def g(k):
        m = re.search(rf"{k}=(.+)", content)
        return m.group(1).strip() if m else None
    return g

g = load_env()

# ── Arguments ───────────────────────────────────────
if len(sys.argv) < 2:
    print("Usage: python3 export_review_audio.py <character_id_1> [character_id_2] ...")
    sys.exit(1)

user_ids = [int(x) for x in sys.argv[1:]]
output_dir = os.path.expanduser("~/.hermes/workspace/output")
os.makedirs(output_dir, exist_ok=True)
ts = datetime.now().strftime("%Y%m%d_%H%M%S")
uid_str = "_".join(str(u) for u in user_ids)
# File name prefix kept as-is so it matches the exported .xlsx files in this commit
output_path = f"{output_dir}/知识巩固_音频_{uid_str}_{ts}.xlsx"

print(f"Exporting characters: {user_ids}")
print(f"Output file: {output_path}")

# ── 1. Query PG: unit review records ────────────────
print("\n[1/3] Querying PostgreSQL unit review records...")
import psycopg2
from psycopg2.extras import RealDictCursor

pg_conn = psycopg2.connect(
    host=g("VALA_PG_ONLINE_HOST"), port=int(g("VALA_PG_ONLINE_PORT")),
    user=g("VALA_PG_ONLINE_USER"), password=g("VALA_PG_ONLINE_PASSWORD"),
    dbname=g("VALA_PG_ONLINE_DB"), connect_timeout=10,
)

with pg_conn.cursor(cursor_factory=RealDictCursor) as cur:
    cur.execute("""
        SELECT user_id, story_id, chapter_id, unique_id,
               score, score_text, sp_value, exp, level,
               question_list, play_time, created_at, updated_at
        FROM user_unit_review_question_result
        WHERE user_id = ANY(%s) AND deleted_at IS NULL
        ORDER BY user_id, updated_at DESC
    """, (user_ids,))
    review_rows = cur.fetchall()

# Parse question_list JSON for readable summary
for row in review_rows:
    ql = row["question_list"]
    if isinstance(ql, str):
        try:
            ql = json.loads(ql)
        except json.JSONDecodeError:
            # Leave unparseable values as-is; they are counted as 0 questions below
            pass
    questions = []
    if isinstance(ql, list):
        for item in ql:
            if isinstance(item, dict):
                q = item.get("question", {})
                qtype = q.get("type", "")
                qtitle = q.get("title", "")
                user_answer = item.get("userAnswer", "")
                score = item.get("score", "")
                questions.append(f"[{qtype}] {qtitle} | answer: {user_answer} | score: {score}")
    row["question_summary"] = "\n".join(questions)
    row["question_count"] = len(ql) if isinstance(ql, list) else 0

pg_conn.close()
print(f"  → Found {len(review_rows)} unit review records")

# ── 2. Query ES: audio data ─────────────────────────
print("\n[2/3] Querying Elasticsearch audio data...")
es_url = f"{g('VALA_ES_ONLINE_SCHEME')}://{g('VALA_ES_ONLINE_HOST')}:{g('VALA_ES_ONLINE_PORT')}"
auth = f"{g('VALA_ES_ONLINE_USER')}:{g('VALA_ES_ONLINE_PASSWORD')}"

audio_rows = []
scroll_id = None
page_size = 500

# First page
query = {
    "query": {"terms": {"userId": user_ids}},
    "sort": [{"timeInt": {"order": "desc"}}],
    "size": page_size,
}
r = subprocess.run([
    "curl", "-sk", "-u", auth,
    "-H", "Content-Type: application/json",
    "--connect-timeout", "10", "--max-time", "30",
    "-X", "POST", "-d", json.dumps(query),
    f"{es_url}/user-audio/_search?scroll=2m"
], capture_output=True, text=True, timeout=35)
resp = json.loads(r.stdout)
scroll_id = resp.get("_scroll_id")
total = resp.get("hits", {}).get("total", {}).get("value", 0)
print(f"  → ES reports {total} audio records in total, reading in batches...")

hits = resp.get("hits", {}).get("hits", [])
for h in hits:
    audio_rows.append(h["_source"])

# Scroll remaining
batch = 1
while len(audio_rows) < total:
    r = subprocess.run([
        "curl", "-sk", "-u", auth,
        "-H", "Content-Type: application/json",
        "--connect-timeout", "10", "--max-time", "30",
        "-X", "POST", "-d", json.dumps({"scroll": "2m", "scroll_id": scroll_id}),
        f"{es_url}/_search/scroll"
    ], capture_output=True, text=True, timeout=35)
    resp = json.loads(r.stdout)
    scroll_id = resp.get("_scroll_id")
    hits = resp.get("hits", {}).get("hits", [])
    if not hits:
        break
    for h in hits:
        audio_rows.append(h["_source"])
    batch += 1
    print(f"  → Batch {batch}: read {len(audio_rows)}/{total} records")

# Clean up scroll (a request with a JSON body needs the JSON Content-Type header,
# otherwise curl sends a form content type that Elasticsearch rejects)
subprocess.run([
    "curl", "-sk", "-u", auth, "--connect-timeout", "5",
    "-H", "Content-Type: application/json",
    "-X", "DELETE", "-d", json.dumps({"scroll_id": scroll_id}),
    f"{es_url}/_search/scroll"
], capture_output=True, timeout=10)

print(f"  → Read {len(audio_rows)} audio records in total")

# ── 3. Export Excel ─────────────────────────────────
print("\n[3/3] Generating Excel...")
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.styles import Font, Alignment, PatternFill

wb = Workbook()

# Sheet 1: unit review records
ws1 = wb.active
ws1.title = "Unit review records"
review_data = []
for row in review_rows:
    review_data.append({
        "Character ID": row["user_id"],
        "Level": row["level"],
        "Story ID": row["story_id"],
        "Chapter ID": row["chapter_id"],
        "Unique ID": row["unique_id"],
        "Score": row["score"],
        "Rating": row["score_text"],
        "SP value": row["sp_value"],
        "Experience": row["exp"],
        "Question count": row["question_count"],
        "Duration (s)": row["play_time"],
        "Question details": row["question_summary"],
        "Updated at": str(row["updated_at"]),
        "Created at": str(row["created_at"]),
    })

df1 = pd.DataFrame(review_data)
for r_idx, row in enumerate(dataframe_to_rows(df1, index=False, header=True), 1):
    for c_idx, value in enumerate(row, 1):
        ws1.cell(row=r_idx, column=c_idx, value=value)

# Style header
header_font = Font(bold=True, color="FFFFFF")
header_fill = PatternFill(start_color="4472C4", end_color="4472C4", fill_type="solid")
for cell in ws1[1]:
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = Alignment(horizontal="center")

# Column widths
ws1.column_dimensions["A"].width = 10
ws1.column_dimensions["L"].width = 12
ws1.column_dimensions["M"].width = 60

# Sheet 2: audio data
ws2 = wb.create_sheet("Audio data")
audio_data = []
for a in audio_rows:
    # Extract makee_id from userMsg if present
    makee_id = ""
    user_msg = a.get("userMsg", "")
    if isinstance(user_msg, str) and "makee_id" in user_msg:
        try:
            um = json.loads(user_msg)
            makee_id = um.get("makee_id", "")
        except (json.JSONDecodeError, AttributeError):
            # Not valid JSON, or not a JSON object: leave makee_id empty
            pass

    audio_data.append({
        "Character ID": a.get("userId"),
        "Character name": a.get("userName"),
        "Session ID": a.get("sessionId"),
        "Component ID": a.get("componentId"),
        "Component type": a.get("componentType"),
        "Audio URL": a.get("audioUrl"),
        "LLM audio URL": a.get("llmAudioUrl"),
        "ASR status": a.get("asrStatus"),
        "Pronunciation score (SOE)": json.dumps(a.get("soeData")) if a.get("soeData") else "",
        "Round": a.get("roundNum"),
        "Makee ID": makee_id,
        "Time": a.get("timeStr"),
        "Timestamp": a.get("timeInt"),
        "Data version": a.get("dataVersion"),
    })

df2 = pd.DataFrame(audio_data)
for r_idx, row in enumerate(dataframe_to_rows(df2, index=False, header=True), 1):
    for c_idx, value in enumerate(row, 1):
        ws2.cell(row=r_idx, column=c_idx, value=value)

for cell in ws2[1]:
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = Alignment(horizontal="center")

ws2.column_dimensions["G"].width = 50
ws2.column_dimensions["H"].width = 50
ws2.column_dimensions["I"].width = 15
ws2.column_dimensions["K"].width = 40
ws2.column_dimensions["M"].width = 22

# Sheet 3: summary
ws3 = wb.create_sheet("Summary")
ws3["A1"] = "Export information"
ws3["A1"].font = Font(bold=True, size=14)
ws3["A3"] = "Exported at"
ws3["B3"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
ws3["A4"] = "Character IDs"
ws3["B4"] = ", ".join(str(u) for u in user_ids)
ws3["A5"] = "Unit review records"
ws3["B5"] = len(review_rows)
ws3["A6"] = "Audio records"
ws3["B6"] = len(audio_rows)

# Per-user breakdown
row_offset = 8
ws3[f"A{row_offset}"] = "Per-character statistics"
ws3[f"A{row_offset}"].font = Font(bold=True)
row_offset += 1
ws3[f"A{row_offset}"] = "Character ID"
ws3[f"B{row_offset}"] = "Review records"
ws3[f"C{row_offset}"] = "Audio records"
ws3[f"D{row_offset}"] = "Latest review time"
for cell in ws3[row_offset]:
    cell.fill = header_fill
    cell.font = Font(bold=True, color="FFFFFF")

row_offset += 1
for uid in user_ids:
    r_cnt = sum(1 for r in review_rows if r["user_id"] == uid)
    a_cnt = sum(1 for a in audio_rows if a.get("userId") == uid)
    latest = max(
        (str(r["updated_at"]) for r in review_rows if r["user_id"] == uid),
        default="none"
    )
    ws3[f"A{row_offset}"] = uid
    ws3[f"B{row_offset}"] = r_cnt
    ws3[f"C{row_offset}"] = a_cnt
    ws3[f"D{row_offset}"] = latest
    row_offset += 1

ws3.column_dimensions["A"].width = 18
ws3.column_dimensions["B"].width = 22
ws3.column_dimensions["C"].width = 18
ws3.column_dimensions["D"].width = 28

wb.save(output_path)
print(f"\n✅ Export complete: {output_path}")
print(f"   Sheet 1: unit review records, {len(review_rows)} rows")
print(f"   Sheet 2: audio data, {len(audio_rows)} rows")
print(f"   Sheet 3: summary")
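The batch-reading loop in `scripts/export_review_audio.py` follows a general pattern: read the first page, then keep requesting the next page with the returned cursor until the reported total is reached or a page comes back empty. The sketch below shows that control flow against an in-memory stand-in rather than Elasticsearch; `make_stub_backend`, `read_all`, and the sample records are hypothetical, and only the response keys (`_scroll_id`, `hits.total.value`, `hits.hits[]._source`) mirror what the script reads.

```python
from typing import Callable, Dict, List, Optional


def make_stub_backend(records: List[dict], page_size: int) -> Callable[[Optional[str]], Dict]:
    """Return a fetch(scroll_id) function that pages through records like a scroll cursor."""
    def fetch(scroll_id: Optional[str]) -> Dict:
        start = int(scroll_id) if scroll_id else 0
        page = records[start:start + page_size]
        return {
            "_scroll_id": str(start + len(page)),
            "hits": {
                "total": {"value": len(records)},
                "hits": [{"_source": r} for r in page],
            },
        }
    return fetch


def read_all(fetch: Callable[[Optional[str]], Dict]) -> List[dict]:
    """Fetch the first page, then scroll until the total is reached or a page is empty."""
    resp = fetch(None)
    total = resp["hits"]["total"]["value"]
    rows = [h["_source"] for h in resp["hits"]["hits"]]
    while len(rows) < total:
        resp = fetch(resp["_scroll_id"])
        hits = resp["hits"]["hits"]
        if not hits:
            break  # guards against a total that is never reached
        rows.extend(h["_source"] for h in hits)
    return rows


# Made-up records; 7 records with a page size of 3 take three fetches.
records = [{"userId": 1, "n": i} for i in range(7)]
rows = read_all(make_stub_backend(records, page_size=3))
print(len(rows))
```

Separating the fetch function from the loop makes the pagination logic testable without a live cluster.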