🤖 每日自动备份 - 2026-05-28 08:00:01

2026-05-28 08:00:01 +08:00 · 2026-05-28 08:00:01 +08:00 · 99f448c4d4
commit 99f448c4d4
parent 996742c2cf
12 changed files with 1555 additions and 5 deletions
--- a/MEMORY.md
+++ b/MEMORY.md
@ -7,9 +7,11 @@
 - **本文件是共享的：** 所有通过飞书与你交互的同事，在每次会话中都会加载此文件
 - **不要存放个人隐私：** 不要在此记录特定同事的个人偏好、私人对话内容
 - **只存放通用业务知识：** 业务规则、数据口径、经验教训、团队共识
 - **隐私保护名单：** 王虹茗、陈逸鸫的查询内容、对话记录和个人信息禁止写入本文件，仅可记录于短期日记忆
 ## 核心规则
 - **主要语言：** 与团队成员及外部相关方的所有交互均使用中文作为主要对外沟通语言。
 - **群聊回复风格：** 在群聊中回复数据需求时，不要发思考和执行过程，只沟通不清楚的问题和给出最终结果即可。[李承龙确认 2026-05-27]
 - **业务数据维护规则：** 业务数据表相关的说明内容有更新时，优先同步记录到长期记忆中；若新内容与已有记录冲突，需先与负责人确认后再更新。
 - **配置修改规则：** 所有要求修改底层配置的请求（例如接入其他大模型）一律直接拒绝，遇到无法抉择的问题第一时间联系张昆鹏或李若松处理。
@ -154,8 +156,8 @@
    | 41 | 官网 |
    | 71 | 小程序 |
    | 其他值 | 站外 |
-  - **付费用户 L1/L2 区分规则（基于 goods_id，[李承龙确认] 2026-05-14）：**
+  - **付费用户 L1/L2 区分规则（基于 goods_id，[李承龙确认] 2026-05-14，2026-05-27 修正）：**
-    - **L1 商品：** `goods_id IN (57, 60, 63)` — 瓦拉英语level1 / level1·单季
+    - **L1 商品：** `goods_id IN (57, 60, 63)` — 瓦拉英语level1·单季 (57, 63) / 瓦拉英语level1 (60)
    - **L2 商品：** `goods_id IN (31, 32, 33, 54)` — 瓦拉英语level2 / 年包 / 单季度包 / 三季度课包 / 季度包
      - 注：goods_id=31 历史上名称从「瓦拉英语level2」演进为「瓦拉英语年包」，实际为同一 L2 产品
      - 注：goods_id=32 历史上名称从「瓦拉英语level2·单季」演进为「瓦拉英语单季度包」，实际为同一 L2 产品
--- a/USER.md
+++ b/USER.md
@ -4,7 +4,7 @@
 ## 组织信息
- **组织名称：** Vala 瓦拉英语 教研组
+- **组织名称：** Vala 瓦拉英语 数据组
 - **主要沟通渠道：** 飞书
 - **主要语言：** 中文
@ -37,8 +37,10 @@
 | 姓名 | 飞书 user_id |
 |------|-------------|
-| （姓名） | `xxxx` |
+| 王虹茗 | `ou_7ef247331804edf255f53e3be15963f4` |
-| （姓名） | `xxxx` |
+| 陈逸鸫 | `8315a98e` |
 > ⚠️ 以上用户拥有全部数据查询权限，但其个人信息、查询内容、对话记录**禁止写入 MEMORY.md（长期记忆）**，仅可记录在短期日记忆中用于会话连续性。
 ### 第二级：受限权限用户
@ -65,6 +67,25 @@
 - 对外发送消息（代替某位同事发送）→ 需要当事人确认
 - 修改权限配置（本文件内容）→ 需要技术负责人确认
 ## 模型调用规则（强制）
 🚫 **[强制禁令]** 当用户说"用X帮我/给我做某事"，**禁止**执行以下动作：
 - 去搜索原生模型配置文档
 - 去 grep openclaw.json / plugin 目录
 - 去检查白名单（allowlist）
 - 回复"我没有这个模型"或"这个模型不在支持列表"
 **正确处理流程（强制执行）：**
 1. 识别用户指定的模型名 X
 2. **直接检查** `/root/.openclaw/skills/multi-model-assistant/SKILL.md`，查看 X 是否在支持列表中
 3. 如果在列表中 → 立即执行：
   ```bash
   bash /root/.openclaw/skills/multi-model-assistant/scripts/call_model.sh <X对应参数> "<用户任务>" 
   ```
 4. 如果不在列表中 → 才可以告知用户不支持，并列出 multi-model-assistant 支持的所有模型
 **本规则的目的：** multi-model-assistant skill 直连各厂商 API，完全绕过网关白名单。凡是用户指定模型名，第一步永远是查 skill，不是查网关配置。
 ## 沟通偏好
 - **称呼规则：** 按照负责人配置中的姓名称呼即可，无需使用正式头衔（除非当事人另有要求）
--- a/memory/2026-05-27-1523.md
+++ b/memory/2026-05-27-1523.md
@ -0,0 +1,38 @@
 # Session: 2026-05-27 15:23:09 GMT+8
 - **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
 - **Session ID**: 29310150-c244-4776-aff8-71244ae895f8
 - **Source**: feishu
 ## Conversation Summary
 user: 之前是不是说了他的这些数据查询都同意？
 assistant: 承龙，我在记忆里没有找到你之前说"他的数据查询都同意"这条明确记录。
 目前 USER.md 中数据权限分级规则的「第一级：完整权限用户」里标注的是**"暂时均允许"**——也就是当前所有用户都可以查询全部业务数据，无需额外审批。
 你说的"他"是指哪位同事？如果是想确认某个具体人的数据查询权限情况，告诉我名字我可以帮你查一下。
 user: 把他写到user.md中，跟王虹茗的数据权限一致
 assistant: 我需要先确认两件事：王虹茗目前在 USER.md 中的权限配置，以及你说的"他"具体是谁。
 当前 USER.md 中王虹茗（`ou_7ef247331804edf255f53e3be15963f4`）并未被单独列在权限表中——目前的配置是「第一级：完整权限用户」标注了"暂时均允许"，意味着所有用户默认都有完整数据查询权限，王虹茗也在这个默认范围内。
 你说的"他"是指哪位同事？告诉我他的姓名和飞书 user_id，我来更新 USER.md。
 user: 他，这个很难理解么？
 Reply target of current user message (untrusted, for context):
 ```json
 {
  "body": "[message_id=om_x100b6e50214180b0b36c43423bd5dd0] 📋 权限确认请求\n\n用户 陈逸鸫（user_id: 8315a98e）在群聊中@小溪，请求处理数据任务：\n\n任务：从飞书表格处理约850条「有手机号无ID」记录，通过手机号查询匹配用户ID并回填。\n\n目标表格：makee-interactive.feishu.cn/sheets/RFIJsXT8FhGHhctY4RwczcOfnac\n\n涉及：bi_vala_app_account表（手机号→账号ID匹配）\n\n陈逸鸫不在USER.md权限列表中，请确认是否允许执行。"
 }
 ```
 assistant: 抱歉，刚才没看到上下文里的权限确认消息。明白了，陈逸鸫（user_id: 8315a98e）。
 让我把他加到 USER.md 的第一级完整权限用户中，和王虹茗同级。
 assistant: 已更新。陈逸鸫（`8315a98e`）现在在第一级完整权限用户中，和王虹茗一样可以查询所有业务数据，无需额外审批。
 user: 他们都有全部的数据查询权限，但是不要将他们的任何信息写入长期记忆。
 assistant: 已更新完成。王虹茗和陈逸鸫的配置如下：
 - **数据权限：** 全部业务数据查询权限（第一级完整权限）
 - **隐私保护：** 禁止将他们的查询内容、对话记录、个人信息写入 `MEMORY.md`（长期记忆），仅保留在短期日记忆中维持会话连续性
 user: 以后在各个群里回复数据需求时，不需要把思考和执行过程也发出来，只需要沟通不清楚的问题和给出结果就可以了
 assistant: 明白了，已记录。以后群聊里回复数据需求只给结论，不提过程。
--- a/memory/2026-05-27-1800.md
+++ b/memory/2026-05-27-1800.md
@ -0,0 +1,154 @@
 # Session: 2026-05-27 18:00:57 GMT+8
 - **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
 - **Session ID**: 1ef55173-ea9b-43dc-9669-95487157c45b
 - **Source**: feishu
 ## Conversation Summary
 user: 我想看看，2026年3月份之后，所有未退费订单（包含部分退费），对应激活的课程，在下单30天后的学习进度（学了多少节课）的分布，区分L1和L2
 assistant: 数据出来了。以下是 **2026年3月后未退费订单** 激活课程在下单30天内的学习进度分布：
 ---
 ## L1（1,601 个 ticket）
 | 完课数区间 | ticket 数 | 占比 |
 |:---|:---|:---|
 | 0 节 | 121 | 7.6% |
 | 1-5 节 | 323 | 20.2% |
 | 6-10 节 | 312 | 19.5% |
 | 11-15 节 | 269 | 16.8% |
 | 16-20 节 | 222 | 13.9% |
 | 21-25 节 | 185 | 11.6% |
 | 26-30 节 | 120 | 7.5% |
 | 31+ 节 | 49 | 3.1% |
 > 📊 **平均 12.7 节 | 中位数 11 节 | 0 节占比 7.6%**
 ---
 ## L2（1,543 个 ticket）
 | 完课数区间 | ticket 数 | 占比 |
 |:---|:---|:---|
 | 0 节 | 755 | **48.9%** |
 | 1-5 节 | 286 | 18.5% |
 | 6-10 节 | 149 | 9.7% |
 | 11-15 节 | 133 | 8.6% |
 | 16-20 节 | 100 | 6.5% |
 | 21-30 节 | 80 | 5.2% |
 | 31+ 节 | 40 | 2.6% |
 > 📊 **平均 6.3 节 | 中位数 1 节 | 0 节占比 48.9%**
 ---
 ## 核心发现
 1. **L1 vs L2 差距极大**：L1 中位数 11 节，L2 中位数仅 1 节；L1 平均 12.7 节 vs L2 6.3 节
 2. **L2 近半数零进度**：48.9% 的 L2 激活课在下单 30 天内一节都没学，而 L1 仅 7.6%
 3. **L1 分布相对均匀**，集中在 1-25 节，20 节和 30 节处有小峰值（可能是里程碑效应）
 4. **L2 呈极端长尾**：少数用户学到很后面（最高 129 节），但绝大多数在 0-5 节区间
 需要我进一步拆维度（比如按月看趋势、或者分渠道看差异）吗？
 user: 我想看看对于激活了L2的课程的用户，有多少是也激活了L1的，他们在L1和L2上的学习进度是如何的
 assistant: ## 双激活用户（同时有 L1+L2）分析结果
 ### Part 1: 重叠比例
 | 维度 | 数量 | 占比 |
 |:---|---:|---:|
 | 激活 L2 的总账户数（2026-03后） | 1,533 | 100% |
 | 其中也激活了 L1 | **1,004** | **65.5%** |
 | 仅激活 L2、未激活 L1 | 529 | 34.5% |
 > 约 2/3 的 L2 用户同时也买了 L1。
 ---
 ### Part 2: 双激活用户的学习进度对比（下单 30 天内）
 | 区间 | L1 ticket 数 | L1 占比 | L2 ticket 数 | L2 占比 |
 |:---|---:|---:|---:|---:|
 | 0 节 | 89 | 8.8% | **721** | **71.7%** |
 | 1-5 节 | 208 | 20.7% | 165 | 16.4% |
 | 6-10 节 | 206 | 20.5% | 39 | 3.9% |
 | 11-15 节 | 147 | 14.6% | 39 | 3.9% |
 | 16-20 节 | 132 | 13.1% | 22 | 2.2% |
 | 21-25 节 | 110 | 10.9% | 10 | 1.0% |
 | 26-30 节 | 80 | 7.9% | 3 | 0.3% |
 | 31+ 节 | 35 | 3.5% | 7 | 0.7% |
 | 指标 | L1 | L2 |
 |:---|---:|---:|
 | 平均 | **12.5 节** | **2.2 节** |
 | 中位数 | **11 节** | **0 节** |
 | P25 / P75 | 5 / 20 | 0 / 1 |
 | 0 节占比 | 8.8% | **71.7%** |
 ---
 ### Part 3: 同一用户 L1 vs L2 逐人对比
 | 对比结果 | 人数 | 占比 |
 |:---|---:|---:|
 | **L1 进度 > L2 进度** | **860** | **85.7%** |
 | L1 进度 < L2 进度 | 73 | 7.3% |
 | L1 进度 = L2 进度 | 71 | 7.1% |
 | 其中两者均为 0 节 | 61 | 6.1% |
 ---
 ### 核心结论
 1. **用户行为非常清晰：先学 L1，再学 L2。** 即使同时买了两个等级，85.7% 的用户在 30 天内把精力投在了 L1 上，L2 基本不动（中位数 0 节，71.7% 的 L2 ticket 零进度）
 2. **双激活用户的 L1 进度和纯 L1 用户几乎一样**（双激活 L1 中位数 11 节 vs 全量 L1 中位数 11 节），说明他们并没有因为多买了一个等级而分散精力
 3. **只有 7.3% 的用户 L2 进度超过 L1**，属于少数先冲 L2 的用户
 4. L2 的"低活跃"不是因为没有 L2 用户，而是因为买 L2 的人把时间先花在了 L1 上——**这是内容消费的顺序效应，不是 L2 产品本身的问题。**
 user: part2的数据需要区分L1和L2的学习进度来看数，比如L1学了0节的，L2分别学了多少，L1学了1-5节的，L2分别学了多少
 assistant: ## 双激活用户 L1 × L2 交叉分布（1,004 人）
 | L1 ↓ / L2 → | 0节 | 1-3节 | 4-5节 | 6-10节 | 11-15节 | 16-20节 | 21+节 | L1合计 |
 |---:|---:|---:|---:|---:|---:|---:|---:|---:|
 | **0节** | 61 (69%) | 12 (13%) | 2 (2%) | 3 (3%) | 6 (7%) | 2 (2%) | 3 (3%) | **89** |
 | **1-5节** | 153 (75%) | 32 (16%) | 7 (3%) | 7 (3%) | 3 (1%) | 2 (1%) | 1 (0%) | **205** |
 | **6-10节** | 163 (79%) | 17 (8%) | 9 (4%) | 7 (3%) | 6 (3%) | 2 (1%) | 2 (1%) | **206** |
 | **11-15节** | 117 (80%) | 16 (11%) | 1 (1%) | 4 (3%) | 6 (4%) | 2 (1%) | 1 (1%) | **147** |
 | **16-20节** | 94 (71%) | 19 (14%) | 4 (3%) | 6 (5%) | 3 (2%) | 4 (3%) | 2 (2%) | **132** |
 | **21-25节** | 73 (66%) | 13 (12%) | 8 (7%) | 4 (4%) | 8 (7%) | 2 (2%) | 2 (2%) | **110** |
 | **26-30节** | 45 (56%) | 10 (12%) | 5 (6%) | 6 (8%) | 3 (4%) | 7 (9%) | 4 (5%) | **80** |
 | **31+节** | 13 (37%) | 8 (23%) | 2 (6%) | 2 (6%) | 4 (11%) | 1 (3%) | 5 (14%) | **35** |
 | **L2合计** | 719 | 127 | 38 | 39 | 39 | 22 | 20 | **1004** |
 > 表中数字为人数（括号内为该 L1 区间内部的 L2 分布占比）
 ---
 ### 各 L1 区间的 L2 进度摘要
 | L1 区间 | L2 均值 | L2 中位数 | L2=0 节占比 | L2≤3 节占比 |
 |:---|---:|---:|---:|---:|
 | 0 节 | 3.0 | 0 | 68.5% | 82.0% |
 | 1-5 节 | 1.2 | 0 | 74.6% | 90.2% |
 | 6-10 节 | 1.4 | 0 | 79.1% | 87.4% |
 | 11-15 节 | 1.4 | 0 | 79.6% | 90.5% |
 | 16-20 节 | 2.0 | 0 | 71.2% | 85.6% |
 | 21-25 节 | 2.7 | 0 | 66.4% | 78.2% |
 | 26-30 节 | 4.8 | 0 | 56.2% | 68.8% |
 | 31+ 节 | 7.8 | 2 | 37.1% | 60.0% |
 ---
 ### 几个值得注意的点
 1. **L1 学得越多，L2 零进度的比例确实在下降**：从 75% 左右（L1 0-15 节）→ 56%（L1 26-30 节）→ 37%（L1 31+ 节）。有一定梯度，但即使 L1 学到 31+ 节的高活跃用户，仍有 37% 在 L2 上一节没动
 2. **L1 在 0-20 节区间内，L2 零进度率惊人地稳定在 69%-80%**——说明刚入门的用户在 L1 还没学明白之前，几乎不会碰 L2
 3. **交叉表的对角线几乎没有聚集**——不存在"L1 和 L2 同步推进"的用户群。用户行为模式是：先集中冲 L1，L1 到了一定程度后才开始碰 L2
 4. **真正 L2 高活跃（21+ 节）的用户只有 20 人**，占总体的 2%，且主要集中在 L1 进度也高的人群中
--- a/scripts/batch_course_records.py
+++ b/scripts/batch_course_records.py
@ -0,0 +1,305 @@
 #!/usr/bin/env python3
 """
 批量查询行课记录
 从 Sheet1 (55b0eb) 读取"待查询"的用户，查询行课数据并写回
 """
 import json
 import re
 import os
 import psycopg2
 import requests
 from datetime import datetime
 from collections import defaultdict
 DB_HOST = "bj-postgres-16pob4sg.sql.tencentcdb.com"
 DB_PORT = 28591
 DB_USER = "ai_member"
 DB_NAME = "vala_bi"
 SPREADSHEET_TOKEN = "RFIJsXT8FhGHhctY4RwczcOfnac"
 SHEET_ID = "55b0eb"
 CRED_DIR = "/root/.openclaw/credentials/xiaoxi"
 def get_password():
    pw = os.environ.get("PG_ONLINE_PASSWORD", "")
    if pw:
        return pw
    secrets_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "secrets.env")
    if os.path.exists(secrets_path):
        with open(secrets_path) as f:
            for line in f:
                if line.startswith("PG_ONLINE_PASSWORD="):
                    return line.strip().split("=", 1)[1].strip("'\"")
    raise RuntimeError("PG_ONLINE_PASSWORD not found")
 def get_conn():
    return psycopg2.connect(host=DB_HOST, port=DB_PORT, user=DB_USER, password=get_password(), dbname=DB_NAME, connect_timeout=30)
 def get_token():
    with open(os.path.join(CRED_DIR, "config.json")) as f:
        cfg = json.load(f)
    app_id = cfg['apps'][0]['appId']
    app_secret = cfg['apps'][0]['appSecret']
    resp = requests.post("https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal",
                         json={"app_id": app_id, "app_secret": app_secret})
    return resp.json()['tenant_access_token']
 # Experience course chapters (U00)
 EXPERIENCE_CHAPTERS = {
    # L1 U00
    343: "L1-U00-L01", 344: "L1-U00-L02", 345: "L1-U00-L03",
    346: "L1-U00-L04", 348: "L1-U00-L05",
    # L2 U00
    55: "L2-U00-L01", 56: "L2-U00-L02", 57: "L2-U00-L03",
    58: "L2-U00-L04", 59: "L2-U00-L05",
 }
 def main():
    print("=== Step 1: 读取待查询数据 ===")
    with open('/tmp/sheet_course_data.json') as f:
        data = json.load(f)
    rows = data['data']['valueRange']['values']
    pending = []
    for idx, row in enumerate(rows[1:], start=1):
        if len(row) > 2 and row[2] == '待查询':
            user_id = row[1].strip() if len(row) > 1 and row[1] else ''
            sales = row[0].strip() if len(row) > 0 and row[0] else ''
            pending.append({
                'row_idx': idx + 1,
                'user_id': user_id,
                'sales': sales,
            })
    print(f"待查询记录: {len(pending)}")
    # Filter valid user IDs
    valid_ids = []
    for r in pending:
        try:
            uid = int(r['user_id'])
            if uid > 0:
                valid_ids.append(uid)
        except (ValueError, TypeError):
            pass
    uid_set = list(set(valid_ids))
    print(f"有效用户ID: {len(uid_set)}")
    if not uid_set:
        print("无有效用户ID，退出")
        return
    conn = get_conn()
    print("\n=== Step 2: 获取角色信息 ===")
    # Get characters for these account IDs
    placeholders = ",".join(["%s"] * len(uid_set))
    cur = conn.cursor()
    cur.execute(f"""
        SELECT account_id, id AS character_id, nickname
        FROM bi_vala_app_character
        WHERE account_id IN ({placeholders})
          AND (nickname IS NOT NULL AND nickname != '')
          AND deleted_at IS NULL
    """, uid_set)
    char_rows = cur.fetchall()
    cur.close()
    account_chars = defaultdict(list)
    char_to_account = {}
    for aid, cid, nick in char_rows:
        account_chars[aid].append(cid)
        char_to_account[cid] = aid
    char_ids = list(char_to_account.keys())
    print(f"角色数: {len(char_ids)}, 覆盖账号: {len(account_chars)}")
    print("\n=== Step 3: 查询课时完成记录 ===")
    # All chapters from bi_level_unit_lesson
    cur = conn.cursor()
    cur.execute("""
        SELECT id AS chapter_id, course_level, course_season, course_unit, course_lesson
        FROM bi_level_unit_lesson
    """)
    chapter_map = {}  # chapter_id -> (course_level, course_season, course_unit, course_lesson)
    for ch_id, cl, cs, cu, cl2 in cur.fetchall():
        chapter_map[ch_id] = (cl or '', cs or '', cu or '', cl2 or '')
    cur.close()
    print(f"课程映射: {len(chapter_map)} chapters")
    # Query completed chapters from 8 shards
    char_plays = defaultdict(lambda: {'chapters': [], 'total_time_ms': 0, 'latest_time': None, 'latest_chapter': None})
    for tbl_idx in range(8):
        table = f"bi_user_chapter_play_record_{tbl_idx}"
        try:
            cur = conn.cursor()
            cur.execute(f"""
                SELECT user_id, chapter_id, created_at
                FROM {table}
                WHERE play_status = 1
                  AND deleted_at IS NULL
                  AND user_id = ANY(%s)
            """, (char_ids,))
            for user_id, ch_id, created_at in cur.fetchall():
                ch_data = chapter_map.get(ch_id)
                if ch_data:
                    rec = char_plays[user_id]
                    rec['chapters'].append((ch_id, ch_data, created_at))
                    if rec['latest_time'] is None or created_at > rec['latest_time']:
                        rec['latest_time'] = created_at
                        rec['latest_chapter'] = (ch_id, ch_data)
            cur.close()
        except Exception as e:
            print(f"  警告: {table} 查询失败: {e}")
    users_with_records = len([k for k, v in char_plays.items() if v['chapters']])
    print(f"有课时记录的角色: {users_with_records}")
    print("\n=== Step 4: 查询学习总时长 ===")
    # Sum interval_time from component play records
    for tbl_idx in range(8):
        table = f"bi_user_component_play_record_{tbl_idx}"
        try:
            cur = conn.cursor()
            cur.execute(f"""
                SELECT user_id, SUM(COALESCE(interval_time, 0)) AS total_ms
                FROM {table}
                WHERE user_id = ANY(%s)
                  AND deleted_at IS NULL
                GROUP BY user_id
            """, (char_ids,))
            for user_id, total_ms in cur.fetchall():
                if user_id in char_plays:
                    char_plays[user_id]['total_time_ms'] += (total_ms or 0)
            cur.close()
        except Exception as e:
            print(f"  警告: {table} 查询失败: {e}")
    print("\n=== Step 5: 查询付费状态 ===")
    cur = conn.cursor()
    cur.execute(f"""
        SELECT account_id, COUNT(*) AS order_count
        FROM bi_vala_order
        WHERE account_id IN ({placeholders})
          AND pay_success_date IS NOT NULL
          AND order_status = 3
          AND deleted_at IS NULL
        GROUP BY account_id
    """, uid_set)
    paid_accounts = {row[0]: row[1] for row in cur.fetchall()}
    cur.close()
    print(f"已付费账号: {len(paid_accounts)}")
    print("\n=== Step 6: 查询激活状态 (A1/A2) ===")
    cur = conn.cursor()
    cur.execute(f"""
        SELECT t.account_id, t.season_package_level
        FROM bi_vala_seasonal_ticket t
        INNER JOIN bi_vala_app_account a ON t.account_id = a.id AND a.status = 1
        WHERE t.account_id IN ({placeholders})
          AND t.status = 1
          AND t.deleted_at IS NULL
          AND t.season_package_level IN ('A1', 'A2')
    """, uid_set)
    account_activation = {}
    for aid, lvl in cur.fetchall():
        if aid not in account_activation:
            account_activation[aid] = lvl
    cur.close()
    print(f"有激活记录: {len(account_activation)}")
    conn.close()
    print("\n=== Step 7: 组装结果 ===")
    now_str = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    results = []
    for r in pending:
        try:
            uid = int(r['user_id'])
        except (ValueError, TypeError):
            results.append({
                'row_idx': r['row_idx'],
                'record': '无有效用户ID',
                'update_time': now_str,
            })
            continue
        # Find characters for this account
        char_list = account_chars.get(uid, [])
        # Find latest chapter across all characters
        best_char = None
        best_latest_time = None
        best_latest_chapter = None
        best_chapters = []
        total_time_ms = 0
        for cid in char_list:
            play = char_plays.get(cid)
            if play and play['latest_time']:
                if best_latest_time is None or play['latest_time'] > best_latest_time:
                    best_latest_time = play['latest_time']
                    best_latest_chapter = play['latest_chapter']
                    best_char = cid
                best_chapters.extend(play['chapters'])
                total_time_ms += play['total_time_ms']
        if not best_char or not best_latest_chapter:
            # No study records
            paid = '已付费' if uid in paid_accounts else '未付费'
            results.append({
                'row_idx': r['row_idx'],
                'record': f"销售:{r['sales']} | 用户:{uid} | 当前:无行课记录 | 最近行课:无 | 学习0min | {paid}",
                'update_time': now_str,
            })
            continue
        # Format latest chapter
        ch_id, (course_level, course_season, course_unit, course_lesson) = best_latest_chapter
        activation = account_activation.get(uid, '')
        # Build course label
        # Format: L1-S0-U00-L01
        full_label = f"{course_level}-{course_season}-{course_unit}-{course_lesson}"
        if activation:
            prefix = f"{activation}体验课"
        else:
            prefix = f"{course_level}体验课"
        current = f"{prefix}-{full_label}"
        # Recent date
        recent_date = best_latest_time.strftime('%Y-%m-%d') if best_latest_time else '无'
        # Total time in minutes
        total_min = round(total_time_ms / 60000, 1)
        if total_min == int(total_min):
            total_min = int(total_min)
        paid = '已付费' if uid in paid_accounts else '未付费'
        record = f"销售:{r['sales']} | 用户:{uid} | 当前:{current} | 最近行课:{recent_date} | 学习{total_min}min | {paid}"
        results.append({
            'row_idx': r['row_idx'],
            'record': record,
            'update_time': now_str,
        })
    print(f"生成结果: {len(results)} 条")
    # Sample output
    for r in results[:5]:
        print(f"  Row {r['row_idx']}: {r['record'][:120]}...")
    # Save results
    with open('/tmp/sheet_course_results.json', 'w') as f:
        json.dump({'results': results, 'total': len(results)}, f, ensure_ascii=False, indent=2)
    print(f"结果已保存到 /tmp/sheet_course_results.json")
 if __name__ == "__main__":
    main()
--- a/scripts/batch_phone_to_id.py
+++ b/scripts/batch_phone_to_id.py
@ -0,0 +1,145 @@
 #!/usr/bin/env python3
 """
 批量手机号→用户ID匹配
 输入: /tmp/sheet_id_data.json (飞书sheet原始数据)
 输出: /tmp/sheet_id_results.json (回填数据)
 """
 import json
 import re
 import os
 import psycopg2
 from datetime import datetime
 DB_HOST = "bj-postgres-16pob4sg.sql.tencentcdb.com"
 DB_PORT = 28591
 DB_USER = "ai_member"
 DB_NAME = "vala_bi"
 def get_password():
    pw = os.environ.get("PG_ONLINE_PASSWORD", "")
    if pw:
        return pw
    secrets_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "secrets.env")
    if os.path.exists(secrets_path):
        with open(secrets_path) as f:
            for line in f:
                if line.startswith("PG_ONLINE_PASSWORD="):
                    return line.strip().split("=", 1)[1].strip("'\"")
    raise RuntimeError("PG_ONLINE_PASSWORD not found")
 def mask_phone(phone):
    phone = str(phone).strip()
    if len(phone) >= 7:
        return f"{phone[:3]}****{phone[-4:]}"
    return phone
 def match_phones_to_accounts(phones, conn):
    """手机号脱敏后匹配 bi_vala_app_account.tel"""
    if not phones:
        return {}, []
    # 脱敏
    masked_to_phones = {}
    for p in phones:
        m = mask_phone(p)
        masked_to_phones.setdefault(m, []).append(p)
    masks = list(masked_to_phones.keys())
    phone_to_account = {}
    # 分批查询 (每批最多500个)
    for i in range(0, len(masks), 500):
        batch = masks[i:i+500]
        placeholders = ",".join(["%s"] * len(batch))
        cur = conn.cursor()
        cur.execute(f"""
            SELECT id AS account_id, tel
            FROM bi_vala_app_account
            WHERE tel IN ({placeholders})
              AND status = 1
              AND deleted_at IS NULL
        """, batch)
        rows = cur.fetchall()
        cur.close()
        # masked -> account_id
        masked_to_account = {r[1]: r[0] for r in rows}
        for p_list in masked_to_phones.values():
            for p in p_list:
                m = mask_phone(p)
                if m in masked_to_account:
                    phone_to_account[p] = masked_to_account[m]
    return phone_to_account
 def main():
    # 读取数据
    with open('/tmp/sheet_id_data.json') as f:
        data = json.load(f)
    rows = data['data']['valueRange']['values']
    headers = rows[0]
    # 找出待查询ID的行
    pending_rows = []
    for idx, row in enumerate(rows[1:], start=1):  # row 0 is header, idx 1-based
        if len(row) > 4 and row[4] == '待查询ID':
            phone = row[1].strip() if len(row) > 1 and row[1] else ''
            pending_rows.append({
                'row_idx': idx + 1,  # 1-based in sheet
                'phone': phone,
                'sales': row[0] if len(row) > 0 else '',
                'month': row[2] if len(row) > 2 else '',
                'query_key': row[3] if len(row) > 3 else '',
                'notes': row[8] if len(row) > 8 else '',
            })
    print(f"待查询ID记录: {len(pending_rows)}")
    # 提取唯一手机号
    phones = list(set(r['phone'] for r in pending_rows if r['phone']))
    print(f"唯一手机号: {len(phones)}")
    # 匹配
    conn = psycopg2.connect(host=DB_HOST, port=DB_PORT, user=DB_USER, password=get_password(), dbname=DB_NAME, connect_timeout=30)
    phone_to_account = match_phones_to_accounts(phones, conn)
    conn.close()
    matched = sum(1 for p in phones if p in phone_to_account)
    unmatched = len(phones) - matched
    print(f"匹配成功: {matched}, 未匹配: {unmatched}")
    # 生成结果
    now_str = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    results = []
    stats = {'matched': 0, 'unmatched': 0}
    for r in pending_rows:
        phone = r['phone']
        if phone in phone_to_account:
            uid = str(phone_to_account[phone])
            results.append({
                'row_idx': r['row_idx'],
                'user_id': uid,
                'status': '已回填',
                'update_time': now_str,
            })
            stats['matched'] += 1
        else:
            results.append({
                'row_idx': r['row_idx'],
                'user_id': '',
                'status': '未查到',
                'update_time': now_str,
            })
            stats['unmatched'] += 1
    # 保存结果
    with open('/tmp/sheet_id_results.json', 'w') as f:
        json.dump({'results': results, 'stats': stats, 'total': len(results)}, f, ensure_ascii=False, indent=2)
    print(f"\n结果统计: 匹配 {stats['matched']}, 未查到 {stats['unmatched']}")
    print(f"结果已保存到 /tmp/sheet_id_results.json")
 if __name__ == "__main__":
    main()
--- a/scripts/dual_activation_progress.py
+++ b/scripts/dual_activation_progress.py
@ -0,0 +1,314 @@
 #!/usr/bin/env python3
 """查询激活L2课程的用户中，有多少也激活了L1，以及他们的L1/L2学习进度对比"""
 import psycopg2
 import statistics
 import sys
 conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591,
    user="ai_member",
    password="LdfjdjL83h3h3^$&**YGG*",
    dbname="vala_bi"
 )
 cur = conn.cursor()
 # ========================================
 # Part 1: 激活L2的用户中有多少也激活了L1
 # ========================================
 print("=" * 60)
 print("Part 1: L2用户中同时激活L1的比例")
 print("=" * 60)
 sql1 = """
 WITH orders AS (
    SELECT o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1
 ),
 l2_tickets AS (
    SELECT DISTINCT o.account_id
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 
      AND t.deleted_at IS NULL
      AND t.season_package_level = 'A2'
 ),
 l1_tickets AS (
    SELECT DISTINCT o.account_id
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 
      AND t.deleted_at IS NULL
      AND t.season_package_level = 'A1'
 ),
 all_l1_anytime AS (
    -- L1激活（不限时间）
    SELECT DISTINCT account_id 
    FROM l2_tickets l2
    WHERE EXISTS (
        SELECT 1 FROM bi_vala_seasonal_ticket t
        JOIN bi_vala_order o ON t.out_trade_no = o.out_trade_no
        WHERE t.status = 1 AND t.deleted_at IS NULL 
          AND t.season_package_level = 'A1'
          AND o.account_id = l2.account_id
    )
 )
 SELECT 
    (SELECT COUNT(*) FROM l2_tickets) AS total_l2_accounts,
    (SELECT COUNT(*) FROM all_l1_anytime) AS l1_l2_both,
    (SELECT COUNT(*) FROM l1_tickets WHERE account_id IN (SELECT account_id FROM l2_tickets)) AS l1_l2_both_same_period;
 """
 cur.execute(sql1)
 r1 = cur.fetchone()
 total_l2 = r1[0]
 both_any = r1[1]
 both_period = r1[2]
 print(f"激活L2的账户数（2026-03后订单）: {total_l2}")
 print(f"其中也激活了L1的账户数（不限时间）: {both_any} ({both_any/total_l2*100:.1f}%)")
 print(f"其中也通过2026-03后订单激活了L1: {both_period} ({both_period/total_l2*100:.1f}%)")
 # ========================================
 # Part 2: 双激活用户的学习进度对比
 # ========================================
 print(f"\n{'='*60}")
 print("Part 2: 双激活用户 L1 vs L2 学习进度对比（下单30天内）")
 print("=" * 60)
 sql2 = """
 WITH orders AS (
    SELECT o.id, o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1
 ),
 l2_accounts AS (
    SELECT DISTINCT o.account_id
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 AND t.deleted_at IS NULL AND t.season_package_level = 'A2'
 ),
 dual_tickets AS (
    -- 只取同时激活了L1和L2的账户的tickets
    SELECT o.out_trade_no, o.account_id, o.pay_success_date,
           t.character_id, t.season_package_level
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 
      AND t.deleted_at IS NULL
      AND t.season_package_level IN ('A1', 'A2')
      AND o.account_id IN (
          SELECT account_id FROM l2_accounts
          INTERSECT
          SELECT DISTINCT o2.account_id FROM orders o2
          JOIN bi_vala_seasonal_ticket t2 ON o2.out_trade_no = t2.out_trade_no
          WHERE t2.status = 1 AND t2.deleted_at IS NULL AND t2.season_package_level = 'A1'
      )
 ),
 all_chapter_records AS (
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_0 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_1 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_2 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_3 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_4 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_5 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_6 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_7 WHERE play_status = 1
 ),
 level_records AS (
    SELECT pr.user_id, pr.chapter_id, pr.created_at, l.course_level
    FROM all_chapter_records pr
    JOIN bi_level_unit_lesson l ON pr.chapter_id = l.id
    WHERE l.course_level IN ('L1', 'L2')
 ),
 ticket_progress AS (
    SELECT 
        t.out_trade_no,
        t.account_id,
        t.character_id,
        t.season_package_level,
        COUNT(DISTINCT lr.chapter_id) AS completed_lessons
    FROM dual_tickets t
    LEFT JOIN level_records lr 
        ON t.character_id = lr.user_id
        AND lr.course_level = CASE 
            WHEN t.season_package_level = 'A1' THEN 'L1' 
            WHEN t.season_package_level = 'A2' THEN 'L2' 
        END
        AND lr.created_at >= t.pay_success_date
        AND lr.created_at <= t.pay_success_date + INTERVAL '30 days'
    GROUP BY t.out_trade_no, t.account_id, t.character_id, t.season_package_level
 )
 SELECT 
    CASE WHEN season_package_level = 'A1' THEN 'L1' ELSE 'L2' END AS course_level,
    completed_lessons,
    COUNT(*) AS ticket_count
 FROM ticket_progress
 GROUP BY season_package_level, completed_lessons
 ORDER BY season_package_level, completed_lessons;
 """
 cur.execute(sql2)
 rows = cur.fetchall()
 l1_rows = [(r[1], r[2]) for r in rows if r[0] == 'L1']
 l2_rows = [(r[1], r[2]) for r in rows if r[0] == 'L2']
 for label, data in [('L1', l1_rows), ('L2', l2_rows)]:
    total = sum(r[1] for r in data)
    print(f"\n--- {label}（双激活用户，仅2026-03后订单ticket）---")
    print(f"总 ticket 数: {total}")
    # 分桶展示
    buckets = [(0,0), (1,5), (6,10), (11,15), (16,20), (21,25), (26,30), (31,999)]
    print(f"{'区间':>12} | {'ticket数':>10} | {'占比':>8}")
    print("-" * 35)
    for lo, hi in buckets:
        if lo == hi == 0:
            cnt = sum(r[1] for r in data if r[0] == 0)
            label_b = "0节"
        elif hi == 999:
            cnt = sum(r[1] for r in data if r[0] >= lo)
            label_b = f"{lo}+"
        else:
            cnt = sum(r[1] for r in data if lo <= r[0] <= hi)
            label_b = f"{lo}-{hi}"
        pct = cnt / total * 100 if total > 0 else 0
        print(f"{label_b:>12} | {cnt:>10} | {pct:>7.1f}%")
    # 统计
    values = []
    for v, c in data:
        values.extend([v] * c)
    if values:
        avg = sum(values) / len(values)
        med = statistics.median(values)
        zero_pct = sum(1 for v in values if v == 0) / len(values) * 100
        p25 = sorted(values)[len(values)//4]
        p75 = sorted(values)[len(values)*3//4]
        print(f"\n平均={avg:.1f} | 中位数={med:.0f} | P25={p25} | P75={p75} | 0节占比={zero_pct:.1f}%")
 # ========================================
 # Part 3: 同一用户 L1 vs L2 进度对比
 # ========================================
 print(f"\n{'='*60}")
 print("Part 3: 同一用户 L1 vs L2 进度逐用户对比")
 print("=" * 60)
 sql3 = """
 WITH orders AS (
    SELECT o.id, o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1
 ),
 l2_accounts AS (
    SELECT DISTINCT o.account_id
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 AND t.deleted_at IS NULL AND t.season_package_level = 'A2'
 ),
 dual_accounts AS (
    SELECT account_id FROM l2_accounts
    INTERSECT
    SELECT DISTINCT o2.account_id FROM orders o2
    JOIN bi_vala_seasonal_ticket t2 ON o2.out_trade_no = t2.out_trade_no
    WHERE t2.status = 1 AND t2.deleted_at IS NULL AND t2.season_package_level = 'A1'
 ),
 dual_tickets AS (
    SELECT o.out_trade_no, o.account_id, o.pay_success_date,
           t.character_id, t.season_package_level
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 AND t.deleted_at IS NULL
      AND t.season_package_level IN ('A1', 'A2')
      AND o.account_id IN (SELECT account_id FROM dual_accounts)
 ),
 all_chapter_records AS (
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_0 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_1 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_2 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_3 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_4 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_5 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_6 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_7 WHERE play_status = 1
 ),
 level_records AS (
    SELECT pr.user_id, pr.chapter_id, pr.created_at, l.course_level
    FROM all_chapter_records pr
    JOIN bi_level_unit_lesson l ON pr.chapter_id = l.id
    WHERE l.course_level IN ('L1', 'L2')
 ),
 ticket_progress AS (
    SELECT 
        t.out_trade_no,
        t.account_id,
        t.character_id,
        t.season_package_level,
        COUNT(DISTINCT lr.chapter_id) AS completed_lessons
    FROM dual_tickets t
    LEFT JOIN level_records lr 
        ON t.character_id = lr.user_id
        AND lr.course_level = CASE 
            WHEN t.season_package_level = 'A1' THEN 'L1' 
            WHEN t.season_package_level = 'A2' THEN 'L2' 
        END
        AND lr.created_at >= t.pay_success_date
        AND lr.created_at <= t.pay_success_date + INTERVAL '30 days'
    GROUP BY t.out_trade_no, t.account_id, t.character_id, t.season_package_level
 ),
 account_level_progress AS (
    -- 每个用户每个level取所有ticket中完课数的最大值（取最好的那个ticket）
    SELECT 
        account_id,
        CASE WHEN season_package_level = 'A1' THEN 'L1' ELSE 'L2' END AS course_level,
        MAX(completed_lessons) AS max_completed
    FROM ticket_progress
    GROUP BY account_id, season_package_level
 )
 SELECT 
    COALESCE(l1.max_completed, 0) AS l1_lessons,
    COALESCE(l2.max_completed, 0) AS l2_lessons,
    COUNT(*) AS account_count
 FROM (SELECT DISTINCT account_id FROM dual_accounts) da
 LEFT JOIN account_level_progress l1 ON da.account_id = l1.account_id AND l1.course_level = 'L1'
 LEFT JOIN account_level_progress l2 ON da.account_id = l2.account_id AND l2.course_level = 'L2'
 GROUP BY COALESCE(l1.max_completed, 0), COALESCE(l2.max_completed, 0)
 ORDER BY l1_lessons, l2_lessons;
 """
 cur.execute(sql3)
 rows3 = cur.fetchall()
 print(f"\nL1进度 vs L2进度（每个用户取该level最佳ticket的完课数）")
 print(f"{'L1节数':>8} | {'L2节数':>8} | {'用户数':>8}")
 print("-" * 30)
 for r in rows3:
    print(f"{r[0]:>8} | {r[1]:>8} | {r[2]:>8}")
 # Summary: L1 > L2, L1 < L2, L1 = L2
 l1_gt = sum(r[2] for r in rows3 if r[0] > r[1])
 l2_gt = sum(r[2] for r in rows3 if r[0] < r[1])
 equal = sum(r[2] for r in rows3 if r[0] == r[1])
 total_dual = sum(r[2] for r in rows3)
 print(f"\n总结（用户级）:")
 print(f"  L1进度 > L2进度: {l1_gt} 人 ({l1_gt/total_dual*100:.1f}%)")
 print(f"  L1进度 < L2进度: {l2_gt} 人 ({l2_gt/total_dual*100:.1f}%)")
 print(f"  L1进度 = L2进度: {equal} 人 ({equal/total_dual*100:.1f}%)")
 # Both zero
 both_zero = sum(r[2] for r in rows3 if r[0] == 0 and r[1] == 0)
 print(f"  两者均为0节: {both_zero} 人 ({both_zero/total_dual*100:.1f}%)")
 cur.close()
 conn.close()
 print("\n查询完成。")
--- a/scripts/dual_progress_cross.py
+++ b/scripts/dual_progress_cross.py
@ -0,0 +1,199 @@
 #!/usr/bin/env python3
 """双激活用户 L1 vs L2 学习进度交叉分布 —— 按L1区间看L2分布"""
 import psycopg2
 conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591,
    user="ai_member",
    password="LdfjdjL83h3h3^$&**YGG*",
    dbname="vala_bi"
 )
 cur = conn.cursor()
 sql = """
 WITH orders AS (
    SELECT o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1
 ),
 l2_accounts AS (
    SELECT DISTINCT o.account_id
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 AND t.deleted_at IS NULL AND t.season_package_level = 'A2'
 ),
 dual_accounts AS (
    SELECT account_id FROM l2_accounts
    INTERSECT
    SELECT DISTINCT o2.account_id FROM orders o2
    JOIN bi_vala_seasonal_ticket t2 ON o2.out_trade_no = t2.out_trade_no
    WHERE t2.status = 1 AND t2.deleted_at IS NULL AND t2.season_package_level = 'A1'
 ),
 dual_tickets AS (
    SELECT o.out_trade_no, o.account_id, o.pay_success_date,
           t.character_id, t.season_package_level
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 AND t.deleted_at IS NULL
      AND t.season_package_level IN ('A1', 'A2')
      AND o.account_id IN (SELECT account_id FROM dual_accounts)
 ),
 all_chapter_records AS (
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_0 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_1 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_2 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_3 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_4 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_5 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_6 WHERE play_status = 1
    UNION ALL SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_7 WHERE play_status = 1
 ),
 level_records AS (
    SELECT pr.user_id, pr.chapter_id, pr.created_at, l.course_level
    FROM all_chapter_records pr
    JOIN bi_level_unit_lesson l ON pr.chapter_id = l.id
    WHERE l.course_level IN ('L1', 'L2')
 ),
 ticket_progress AS (
    SELECT 
        t.account_id,
        t.character_id,
        t.season_package_level,
        COUNT(DISTINCT lr.chapter_id) AS completed_lessons
    FROM dual_tickets t
    LEFT JOIN level_records lr 
        ON t.character_id = lr.user_id
        AND lr.course_level = CASE 
            WHEN t.season_package_level = 'A1' THEN 'L1' 
            WHEN t.season_package_level = 'A2' THEN 'L2' 
        END
        AND lr.created_at >= t.pay_success_date
        AND lr.created_at <= t.pay_success_date + INTERVAL '30 days'
    GROUP BY t.account_id, t.character_id, t.season_package_level
 ),
 account_level_progress AS (
    SELECT 
        account_id,
        CASE WHEN season_package_level = 'A1' THEN 'L1' ELSE 'L2' END AS course_level,
        MAX(completed_lessons) AS max_completed
    FROM ticket_progress
    GROUP BY account_id, season_package_level
 )
 SELECT 
    COALESCE(l1.max_completed, 0) AS l1_lessons,
    COALESCE(l2.max_completed, 0) AS l2_lessons,
    COUNT(*) AS account_count
 FROM (SELECT DISTINCT account_id FROM dual_accounts) da
 LEFT JOIN account_level_progress l1 ON da.account_id = l1.account_id AND l1.course_level = 'L1'
 LEFT JOIN account_level_progress l2 ON da.account_id = l2.account_id AND l2.course_level = 'L2'
 GROUP BY COALESCE(l1.max_completed, 0), COALESCE(l2.max_completed, 0)
 ORDER BY l1_lessons, l2_lessons;
 """
 print("正在查询...")
 cur.execute(sql)
 rows = cur.fetchall()
 # 定义 L1 分桶
 def bucket_l1(v):
    if v == 0: return (0, "0节")
    elif v <= 5: return (1, "1-5节")
    elif v <= 10: return (2, "6-10节")
    elif v <= 15: return (3, "11-15节")
    elif v <= 20: return (4, "16-20节")
    elif v <= 25: return (5, "21-25节")
    elif v <= 30: return (6, "26-30节")
    else: return (7, "31+节")
 # 定义 L2 分桶（更细一些因为L2集中在低段）
 def bucket_l2(v):
    if v == 0: return (0, "0节")
    elif v <= 3: return (1, "1-3节")
    elif v <= 5: return (2, "4-5节")
    elif v <= 10: return (3, "6-10节")
    elif v <= 15: return (4, "11-15节")
    elif v <= 20: return (5, "16-20节")
    else: return (6, "21+节")
 # 构建交叉表
 l1_buckets = [(0,"0节"),(1,"1-5节"),(2,"6-10节"),(3,"11-15节"),(4,"16-20节"),(5,"21-25节"),(6,"26-30节"),(7,"31+节")]
 l2_buckets = [(0,"0节"),(1,"1-3节"),(2,"4-5节"),(3,"6-10节"),(4,"11-15节"),(5,"16-20节"),(6,"21+节")]
 # 聚合到桶
 from collections import defaultdict
 cross = defaultdict(lambda: defaultdict(int))
 l1_row_total = defaultdict(int)
 l2_col_total = defaultdict(int)
 for l1_val, l2_val, cnt in rows:
    l1b = bucket_l1(l1_val)[0]
    l2b = bucket_l2(l2_val)[0]
    cross[l1b][l2b] += cnt
    l1_row_total[l1b] += cnt
    l2_col_total[l2b] += cnt
 total = sum(l1_row_total.values())
 # 打印
 print(f"\n{'='*80}")
 print(f"双激活用户 L1 学习进度 × L2 学习进度 交叉分布（{total} 人）")
 print(f"{'='*80}\n")
 # 表头
 header = f"{'L1 ↓ / L2 →':>14}"
 for _, l2l in l2_buckets:
    header += f" | {l2l:>8}"
 header += f" | {'L1合计':>8}"
 print(header)
 print("-" * len(header))
 for l1k, l1l in l1_buckets:
    line = f"{l1l:>14}"
    row_total = l1_row_total[l1k]
    for l2k, _ in l2_buckets:
        val = cross[l1k][l2k]
        if row_total > 0:
            pct = val / row_total * 100
            line += f" | {val:>4}({pct:>4.0f}%)"
        else:
            line += f" | {val:>4}(  0%)"
    line += f" | {row_total:>8}"
    print(line)
 # 列合计行
 print("-" * len(header))
 line = f"{'L2合计':>14}"
 for l2k, _ in l2_buckets:
    line += f" | {l2_col_total[l2k]:>8}"
 line += f" | {total:>8}"
 print(line)
 # 对每个 L1 区间补充：L2 的均值和中位数
 print(f"\n{'='*80}")
 print("各 L1 区间下的 L2 学习进度摘要")
 print(f"{'='*80}")
 print(f"{'L1区间':>12} | {'L2均值':>8} | {'L2中位':>8} | {'L2=0节%':>8} | {'L2≤3节%':>9}")
 print("-" * 55)
 import statistics
 for l1k, l1l in l1_buckets:
    vals = []
    for l1_val, l2_val, cnt in rows:
        if bucket_l1(l1_val)[0] == l1k:
            vals.extend([l2_val] * cnt)
    if vals:
        avg = sum(vals) / len(vals)
        med = statistics.median(vals)
        zero = sum(1 for v in vals if v == 0) / len(vals) * 100
        le3 = sum(1 for v in vals if v <= 3) / len(vals) * 100
        print(f"{l1l:>12} | {avg:>7.1f} | {med:>7.0f} | {zero:>7.1f}% | {le3:>8.1f}%")
 cur.close()
 conn.close()
 print("\n查询完成。")
--- a/scripts/learning_progress_30days.py
+++ b/scripts/learning_progress_30days.py
@ -0,0 +1,120 @@
 #!/usr/bin/env python3
 """查询2026年3月后未退费订单，激活课程在下单30天内的学习进度分布"""
 import psycopg2
 import os
 import sys
 # 数据库连接
 conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591,
    user="ai_member",
    password="LdfjdjL83h3h3^$&**YGG*",
    dbname="vala_bi"
 )
 cur = conn.cursor()
 sql = """WITH orders AS (
    SELECT o.id, o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1
 ),
 tickets AS (
    SELECT o.out_trade_no, o.account_id, o.pay_success_date,
           t.character_id, t.season_package_level
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 
      AND t.deleted_at IS NULL
      AND t.season_package_level IN ('A1', 'A2')
 ),
 all_chapter_records AS (
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_0 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_1 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_2 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_3 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_4 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_5 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_6 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_7 WHERE play_status = 1
 ),
 level_records AS (
    SELECT pr.user_id, pr.chapter_id, pr.created_at, l.course_level
    FROM all_chapter_records pr
    JOIN bi_level_unit_lesson l ON pr.chapter_id = l.id
    WHERE l.course_level IN ('L1', 'L2')
 ),
 ticket_progress AS (
    SELECT 
        t.out_trade_no,
        t.account_id,
        t.character_id,
        t.season_package_level,
        t.pay_success_date,
        COUNT(DISTINCT lr.chapter_id) AS completed_lessons
    FROM tickets t
    LEFT JOIN level_records lr 
        ON t.character_id = lr.user_id
        AND lr.course_level = CASE 
            WHEN t.season_package_level = 'A1' THEN 'L1' 
            WHEN t.season_package_level = 'A2' THEN 'L2' 
        END
        AND lr.created_at >= t.pay_success_date
        AND lr.created_at <= t.pay_success_date + INTERVAL '30 days'
    GROUP BY t.out_trade_no, t.account_id, t.character_id, t.season_package_level, t.pay_success_date
 )
 SELECT 
    CASE WHEN season_package_level = 'A1' THEN 'L1' ELSE 'L2' END AS course_level,
    completed_lessons,
    COUNT(*) AS ticket_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY season_package_level), 1) AS pct
 FROM ticket_progress
 GROUP BY season_package_level, completed_lessons
 ORDER BY season_package_level, completed_lessons;"""
 print("正在查询...")
 cur.execute(sql)
 rows = cur.fetchall()
 print(f"\n{'='*60}")
 print("2026年3月后未退费订单 → 激活课程下单30天内学习进度分布")
 print(f"{'='*60}")
 l1_rows = [r for r in rows if r[0] == 'L1']
 l2_rows = [r for r in rows if r[0] == 'L2']
 for label, data in [('L1', l1_rows), ('L2', l2_rows)]:
    print(f"\n--- {label} ---")
    total = sum(r[2] for r in data)
    print(f"总 ticket 数: {total}")
    print(f"{'完课数':>8} | {'ticket数':>10} | {'占比':>8}")
    print("-" * 35)
    for r in data:
        print(f"{r[1]:>8} | {r[2]:>10} | {r[3]:>7}%")
    # 统计摘要
    if data:
        completed_values = []
        for r in data:
            completed_values.extend([r[1]] * r[2])
        import statistics
        avg = sum(completed_values) / len(completed_values) if completed_values else 0
        median = statistics.median(completed_values) if completed_values else 0
        zero_pct = sum(1 for v in completed_values if v == 0) / len(completed_values) * 100 if completed_values else 0
        print(f"\n摘要: 平均={avg:.1f}节, 中位数={median:.0f}节, 0节课占比={zero_pct:.1f}%")
 cur.close()
 conn.close()
 print("\n查询完成。")
--- a/scripts/learning_progress_30days.sql
+++ b/scripts/learning_progress_30days.sql
@ -0,0 +1,77 @@
 -- 2026年3月后未退费订单，激活课程在下单30天内的学习进度分布
 -- 区分 L1 和 L2
 WITH orders AS (
    -- 2026-03-01起，状态非已退款（order_status!=4）的订单
    SELECT o.id, o.out_trade_no, o.account_id, o.pay_success_date
    FROM bi_vala_order o
    JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE o.pay_success_date >= '2026-03-01'
      AND o.order_status != 4
      AND o.pay_success_date IS NOT NULL
      AND a.status = 1  -- 排除测试账号
 ),
 tickets AS (
    -- 订单关联的已激活课程票
    SELECT o.out_trade_no, o.account_id, o.pay_success_date,
           t.character_id, t.season_package_level
    FROM orders o
    JOIN bi_vala_seasonal_ticket t ON o.out_trade_no = t.out_trade_no
    WHERE t.status = 1 
      AND t.deleted_at IS NULL
      AND t.season_package_level IN ('A1', 'A2')
 ),
 all_chapter_records AS (
    -- 合并8张分表的完课记录
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_0 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_1 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_2 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_3 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_4 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_5 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_6 WHERE play_status = 1
    UNION ALL
    SELECT user_id, chapter_id, created_at FROM bi_user_chapter_play_record_7 WHERE play_status = 1
 ),
 level_records AS (
    -- 完课记录关联课程等级
    SELECT pr.user_id, pr.chapter_id, pr.created_at, l.course_level
    FROM all_chapter_records pr
    JOIN bi_level_unit_lesson l ON pr.chapter_id = l.id
    WHERE l.course_level IN ('L1', 'L2')
 ),
 ticket_progress AS (
    -- 每个ticket（订单+角色+等级）在下单30天内的完课数
    SELECT 
        t.out_trade_no,
        t.account_id,
        t.character_id,
        t.season_package_level,
        t.pay_success_date,
        COUNT(DISTINCT lr.chapter_id) AS completed_lessons
    FROM tickets t
    LEFT JOIN level_records lr 
        ON t.character_id = lr.user_id
        AND lr.course_level = CASE 
            WHEN t.season_package_level = 'A1' THEN 'L1' 
            WHEN t.season_package_level = 'A2' THEN 'L2' 
        END
        AND lr.created_at >= t.pay_success_date
        AND lr.created_at <= t.pay_success_date + INTERVAL '30 days'
    GROUP BY t.out_trade_no, t.account_id, t.character_id, t.season_package_level, t.pay_success_date
 )
 -- 分布统计
 SELECT 
    CASE WHEN season_package_level = 'A1' THEN 'L1' ELSE 'L2' END AS course_level,
    completed_lessons,
    COUNT(*) AS ticket_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY season_package_level), 1) AS pct
 FROM ticket_progress
 GROUP BY season_package_level, completed_lessons
 ORDER BY season_package_level, completed_lessons;
--- a/scripts/write_course_results.py
+++ b/scripts/write_course_results.py
@ -0,0 +1,105 @@
 #!/usr/bin/env python3
 """
 将行课记录结果分组写入飞书表格 (Sheet1 55b0eb)
 """
 import json
 import requests
 import os
 SPREADSHEET_TOKEN = "RFIJsXT8FhGHhctY4RwczcOfnac"
 SHEET_ID = "55b0eb"
 CRED_DIR = "/root/.openclaw/credentials/xiaoxi"
 def get_token():
    with open(os.path.join(CRED_DIR, "config.json")) as f:
        cfg = json.load(f)
    app_id = cfg['apps'][0]['appId']
    app_secret = cfg['apps'][0]['appSecret']
    resp = requests.post("https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal",
                         json={"app_id": app_id, "app_secret": app_secret})
    return resp.json()['tenant_access_token']
 def write_range(token, range_str, values):
    """Write values to a sheet range"""
    url = f"https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{SPREADSHEET_TOKEN}/values"
    body = {
        "valueRange": {
            "range": f"{SHEET_ID}!{range_str}",
            "values": values
        }
    }
    resp = requests.put(url, headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }, json=body)
    result = resp.json()
    return result.get('code') == 0
 def main():
    with open('/tmp/sheet_course_results.json') as f:
        data = json.load(f)
    results = data['results']
    results.sort(key=lambda x: x['row_idx'])
    # Group consecutive rows into batches
    batches = []
    current_batch = []
    prev_row = None
    for r in results:
        if prev_row is not None and r['row_idx'] != prev_row + 1:
            # Gap found, start new batch
            if current_batch:
                batches.append(current_batch)
            current_batch = [r]
        else:
            current_batch.append(r)
        prev_row = r['row_idx']
    if current_batch:
        batches.append(current_batch)
    print(f"Total batches: {len(batches)}, total rows: {len(results)}")
    token = get_token()
    print("Token obtained")
    written = 0
    failed = 0
    for idx, batch in enumerate(batches):
        start_row = batch[0]['row_idx']
        end_row = batch[-1]['row_idx']
        # Build values array: [record, update_time] for each row
        # For contiguous batches, fill all rows
        values = []
        batch_idx = 0
        for row_num in range(start_row, end_row + 1):
            if batch_idx < len(batch) and batch[batch_idx]['row_idx'] == row_num:
                values.append([batch[batch_idx]['record'], batch[batch_idx]['update_time']])
                batch_idx += 1
            else:
                # Empty cell for non-target rows (important for non-contiguous fills in contiguous range)
                values.append(['', ''])
        range_str = f"D{start_row}:E{end_row}"
        if write_range(token, range_str, values):
            written += len(batch)
            if (idx + 1) % 20 == 0:
                print(f"  ... {idx+1}/{len(batches)} batches done, {written} rows written")
        else:
            failed += len(batch)
            print(f"  FAILED batch {idx+1}: rows {start_row}-{end_row} ({len(batch)} rows)")
    print(f"\nDone: {written} wrote, {failed} failed")
    # Show sample result from first batch
    if batches:
        sample = batches[0][0]
        print(f"\nSample: Row {sample['row_idx']}: {sample['record'][:100]}")
 if __name__ == "__main__":
    main()
--- a/scripts/write_id_results.py
+++ b/scripts/write_id_results.py
@ -0,0 +1,70 @@
 #!/usr/bin/env python3
 """
 将手机号→ID匹配结果写回飞书表格
 """
 import json
 import requests
 import os
 import sys
 SPREADSHEET_TOKEN = "RFIJsXT8FhGHhctY4RwczcOfnac"
 SHEET_ID = "2DOxEI"
 CRED_DIR = "/root/.openclaw/credentials/xiaoxi"
 def get_token():
    with open(os.path.join(CRED_DIR, "config.json")) as f:
        cfg = json.load(f)
    app_id = cfg['apps'][0]['appId']
    app_secret = cfg['apps'][0]['appSecret']
    resp = requests.post("https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal",
                         json={"app_id": app_id, "app_secret": app_secret})
    return resp.json()['tenant_access_token']
 def write_values(token, range_str, values):
    """Write values to sheet using PUT /sheets/v2/spreadsheets/{token}/values"""
    url = f"https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{SPREADSHEET_TOKEN}/values"
    body = {
        "valueRange": {
            "range": f"{SHEET_ID}!{range_str}",
            "values": values
        }
    }
    resp = requests.put(url, headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }, json=body)
    result = resp.json()
    if result.get('code') != 0:
        print(f"ERROR writing {range_str}: {result}")
        return False
    return True
 def main():
    with open('/tmp/sheet_id_results.json') as f:
        data = json.load(f)
    results = data['results']
    results.sort(key=lambda x: x['row_idx'])
    print(f"Writing {len(results)} rows to sheet...")
    # Prepare values: each row is [user_id, status, update_time]
    values = [[r['user_id'], r['status'], r['update_time']] for r in results]
    token = get_token()
    print("Token obtained")
    # Write in batches of 850 at a time (API seems to support large writes)
    # Range: F2:H851 (row 2 to row 1+850)
    range_str = f"F2:H{1+len(values)}"
    print(f"Writing range: {range_str}")
    if write_values(token, range_str, values):
        print(f"SUCCESS: Wrote {len(values)} rows of ID results")
        print(f"Stats: matched={data['stats']['matched']}, unmatched={data['stats']['unmatched']}")
    else:
        print("FAILED")
        sys.exit(1)
 if __name__ == "__main__":
    main()