🤖 Daily automatic backup - 2026-05-15 08:00:01
11
MEMORY.md
@ -154,6 +154,17 @@
| 41 | Official website |
| 71 | Mini program |
| Other values | Off-site |
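- **Rule for distinguishing L1/L2 paying users (based on goods_id, [confirmed by 李承龙] 2026-05-14):**
  - **L1 products:** `goods_id IN (57, 60, 63)`: 瓦拉英语level1 / level1·单季
  - **L2 products:** `goods_id IN (31, 32, 33, 54)`: 瓦拉英语level2 / 年包 / 单季度包 / 三季度课包 / 季度包
    - Note: goods_id=31 was historically renamed from 「瓦拉英语level2」 to 「瓦拉英语年包」; it is the same L2 product
    - Note: goods_id=32 was historically renamed from 「瓦拉英语level2·单季」 to 「瓦拉英语单季度包」; it is the same L2 product
  - **L1+L2 product:** `goods_id = 61`: 瓦拉英语level1+2
  - **User classification logic:** collect the goods_id of all of a user's orders, then decide:
    - Bought only L1 products → 「仅L1」 (L1 only)
    - Bought only L2 products → 「仅L2」 (L2 only)
    - Bought the L1+L2 product (goods_id=61), or bought both L1 and L2 products → 「L1+L2」
  - **Legacy general passes:** `goods_id IN (4, 5, 6, 10, 13, 14, 17, 20, 25, 29, 30, 35, 36, 37, 38)`; very low volume (<30 orders), not split into L1/L2; suggested handling is to assign them to 「其他」 (Other) or to look them up via `bi_user_course_detail`
- **Amount unit rule:** in the `bi_vala_order` table, `pay_amount` is in yuan and `pay_amount_int` is in fen (1/100 yuan); going forward, use `pay_amount_int` for all sales-amount calculations and divide by 100 when reporting in yuan
- **Learning-data dimensions:** completion count, average time spent, and accuracy (three grades: Perfect/Good/Oops) can be reported per unit, lesson, or component
- **Special date:** `2025-10-01` is the launch date of the core version; some statistics need to separate the data before and after it
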
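The classification and amount rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `classify_user` and `fen_to_yuan` and the three constant names are hypothetical, while the goods_id sets and category labels are the ones recorded in the rule.

```python
# Hypothetical helper names; the goods_id sets come from the rule above.
L1_GOODS = {57, 60, 63}
L2_GOODS = {31, 32, 33, 54}
L1_L2_GOODS = {61}


def classify_user(goods_ids):
    """Classify a user from the goods_id values of all of their orders."""
    ids = set(goods_ids)
    has_l1 = bool(ids & L1_GOODS)
    has_l2 = bool(ids & L2_GOODS)
    has_bundle = bool(ids & L1_L2_GOODS)
    if has_bundle or (has_l1 and has_l2):
        return 'L1+L2'
    if has_l1:
        return '仅L1'
    if has_l2:
        return '仅L2'
    # Legacy passes and any unmapped goods_id fall through to "Other".
    return '其他'


def fen_to_yuan(pay_amount_int):
    """Convert pay_amount_int (fen) to yuan for reporting."""
    return pay_amount_int / 100


print(classify_user([57, 31]))   # one L1 order plus one L2 order
print(fen_to_yuan(12345))
```

A user holding only legacy pass orders ends up in 「其他」 here; the alternative mentioned above, a lookup via `bi_user_course_detail`, is not modelled in this sketch.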
@ -1,6 +1,6 @@
|
||||
{
|
||||
"version": 1,
|
||||
"updatedAt": "2026-05-13T08:20:55.037Z",
|
||||
"updatedAt": "2026-05-14T06:41:55.506Z",
|
||||
"entries": {
|
||||
"memory:memory/2026-05-06.md:1:20": {
|
||||
"key": "memory:memory/2026-05-06.md:1:20",
|
||||
@ -128,6 +128,68 @@
|
||||
"3月28.5",
|
||||
"4月38.3"
|
||||
]
|
||||
},
|
||||
"memory:memory/2026-05-09.md:1:17": {
|
||||
"key": "memory:memory/2026-05-09.md:1:17",
|
||||
"path": "memory/2026-05-09.md",
|
||||
"startLine": 1,
|
||||
"endLine": 17,
|
||||
"source": "memory",
|
||||
"snippet": "# 2026-05-09 工作日志 ## 王虹茗 - 销售线索用户分析 - **用户:** 王虹茗(user_id: af61e4gc) - **需求:** 用 `lead_user_analysis.py` 脚本处理线索用户 Excel(659条,2026年3月,销售:姜小龙/Bob/Tom/吴迪) - **权限处理:** 王虹茗不在 USER.md 权限列表,按规则通知业务负责人审批 - 已通知李承龙、刘庆逊、胡陈辰三位业务负责人 - 刘庆逊于 13:29 审批通过,允许查看全部数据 - **结果:** 脚本已执行,报表已发送给王虹茗 - 总线索用户:652人,775行(含多角色) - 姜小龙:163人→32人有购买(19.6%),退费5人 - Bob:202人→3人有购买(1.5%),退费1人 - Tom:171人→5人有购买(2.9%),退费2人 - 吴迪:116人→19人有购买(16.4%),退费2人 - 输出文件:`output/销售线索_用户分析.xlsx`",
|
||||
"recallCount": 1,
|
||||
"dailyCount": 0,
|
||||
"groundedCount": 0,
|
||||
"totalScore": 1,
|
||||
"maxScore": 1,
|
||||
"firstRecalledAt": "2026-05-14T06:31:19.437Z",
|
||||
"lastRecalledAt": "2026-05-14T06:31:19.437Z",
|
||||
"queryHashes": [
|
||||
"49e79af44bc3"
|
||||
],
|
||||
"recallDays": [
|
||||
"2026-05-14"
|
||||
],
|
||||
"conceptTags": [
|
||||
"user-id",
|
||||
"lead-user-analysis.py",
|
||||
"姜小龙/bob/tom/吴迪",
|
||||
"user.md",
|
||||
"19.6",
|
||||
"1.5",
|
||||
"2.9",
|
||||
"16.4"
|
||||
]
|
||||
},
|
||||
"memory:memory/2026-05-14.md:1:19": {
|
||||
"key": "memory:memory/2026-05-14.md:1:19",
|
||||
"path": "memory/2026-05-14.md",
|
||||
"startLine": 1,
|
||||
"endLine": 19,
|
||||
"source": "memory",
|
||||
"snippet": "# 2026-05-14 工作日志 ## 李承龙 - 付费用户 L1/L2 区分口径 - **需求:** 区分付费用户属于 L1 还是 L2,根据用户购买的商品做区分 - **分析过程:** - 排查了 `bi_vala_order` 表的 goods_name / goods_id 字段 - 发现部分商品名称不含 level 关键词(年包、单季度包等),但 goods_id 可唯一映射 - goods_id=31 历史上从「瓦拉英语level2」更名为「瓦拉英语年包」 - goods_id=32 从「瓦拉英语level2·单季」更名为「瓦拉英语单季度包」 - 用 `bi_user_course_detail` 验证了映射关系准确 - **最终方案:** 按 goods_id 映射 - L1: goods_id IN (57, 60, 63) - L2: goods_id IN (31, 32, 33, 54) - L1+L2: goods_id = 61 - 旧版通行券(<30单): goods_id IN (4,5,6,10,13,14,17,20,25,29,30,35,36,37,38),归入「其他」 - **用户分类:** 汇总用户所有订单的 goods_id,按购买商品组合判断 - **已更新:** MEMORY.md",
|
||||
"recallCount": 1,
|
||||
"dailyCount": 0,
|
||||
"groundedCount": 0,
|
||||
"totalScore": 1,
|
||||
"maxScore": 1,
|
||||
"firstRecalledAt": "2026-05-14T06:41:55.506Z",
|
||||
"lastRecalledAt": "2026-05-14T06:41:55.506Z",
|
||||
"queryHashes": [
|
||||
"f6fcef2ff061"
|
||||
],
|
||||
"recallDays": [
|
||||
"2026-05-14"
|
||||
],
|
||||
"conceptTags": [
|
||||
"l1/l2",
|
||||
"bi-vala-order",
|
||||
"goods-name",
|
||||
"goods-id",
|
||||
"bi-user-course-detail",
|
||||
"memory.md",
|
||||
"工作",
|
||||
"日志"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
39
memory/2026-05-14.md
Normal file
@ -0,0 +1,39 @@
# 2026-05-14 Work Log

## 李承龙 - L1/L2 classification rule for paying users

- **Request:** determine whether a paying user belongs to L1 or L2, based on the products the user purchased
- **Analysis:**
  - Inspected the goods_name / goods_id fields of the `bi_vala_order` table
  - Found that some product names contain no "level" keyword (年包, 单季度包, etc.), but goods_id maps to a level uniquely
  - goods_id=31 was historically renamed from 「瓦拉英语level2」 to 「瓦拉英语年包」
  - goods_id=32 was renamed from 「瓦拉英语level2·单季」 to 「瓦拉英语单季度包」
  - Verified the mapping against `bi_user_course_detail`
- **Final approach:** map by goods_id
  - L1: goods_id IN (57, 60, 63)
  - L2: goods_id IN (31, 32, 33, 54)
  - L1+L2: goods_id = 61
  - Legacy passes (<30 orders): goods_id IN (4,5,6,10,13,14,17,20,25,29,30,35,36,37,38), assigned to 「其他」 (Other)
- **User classification:** collect the goods_id of all of a user's orders and classify by the combination of products purchased
- **Updated:** MEMORY.md

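## Lesson consumption metrics v2 (U0 prologue excluded)
- **L1 U0**: chapter_id IN (343,344,345,346,348)
- **L2 U0**: chapter_id IN (55,56,57,58,59)
- **Results after exclusion (as of 5/10):**
  - 仅L1 (L1 only): 192 paying / 132 with consumption / 60 without (31%) / 2.53 per paying user / 3.67 per user with consumption
  - 仅L2 (L2 only): 1370 paying / 461 with consumption / 909 without (66%) / 1.18 per paying user / 3.49 per user with consumption
  - L1+L2: 1207 paying / 660 with consumption / 547 without (45%) / 2.37 per paying user / 4.34 per user with consumption
- **4 standalone charts generated in output/**
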
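The percentages in the list above follow directly from the paying and with-consumption counts. A minimal sketch of that arithmetic, where the function name `no_consumption_stats` is a hypothetical helper and rounding to a whole percent is an assumption chosen to match the figures shown:

```python
def no_consumption_stats(paid, with_consumption):
    """Return (users without consumption, their share of paying users in whole percent)."""
    without = paid - with_consumption
    rate = round(without / paid * 100) if paid else 0
    return without, rate


# Counts taken from the v2 results above.
for name, paid, active in [('仅L1', 192, 132), ('仅L2', 1370, 461), ('L1+L2', 1207, 660)]:
    print(name, no_consumption_stats(paid, active))
```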
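## 李承龙 - Lesson consumption rule change: L1/L2 regrouped by paying group

- **[confirmed by 李承龙]** L1 paying users = 仅L1 + L1+L2; L2 paying users = 仅L2 + L1+L2 (L1+L2 users are counted in both charts)
- **Regenerated Excel v3** (`output/course_consumption_by_level_v3.xlsx`): 4 sheets (Overview / Weekly detail / L1 charts / L2 charts)
- **Regenerated 4 standalone PNG charts** (`output/L1_all_users_stack.png`, `L1_all_avg_trend.png`, `L2_all_users_stack.png`, `L2_all_avg_trend.png`)
- **Final data (as of the last week, U0 prologue excluded):**
  - L1 paying group: 1,399 users | 738 with consumption | 661 without (43%) | 1.97 per paying user | 3.73 per user with consumption
  - L2 paying group: 2,577 users | 1,126 with consumption | 1,451 without (56%) | 1.51 per paying user | 3.46 per user with consumption
  - Total (deduplicated): 2,769 users
- **Key finding:** once the L1+L2 users (1,207) are added in, the L1 no-consumption rate rises from 31% to 43% and the L2 rate falls from 66% to 56%
- **Scripts:** `scripts/course_excel_v3.py`, `scripts/generate_charts_v3.py`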
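The regrouped totals can be cross-checked against the v2 breakdown with plain arithmetic. The sketch below uses only counts recorded in this log; the variable names are illustrative. Note that recomputing the L1 share from the logged counts (661 of 1,399) gives roughly 47% rather than the 43% recorded, so one of those figures may deserve a re-check; the L2 share does reproduce as 56%.

```python
# Paying-user counts from the v2 breakdown (仅L1, 仅L2, L1+L2).
only_l1, only_l2, both = 192, 1370, 1207

l1_group = only_l1 + both                 # L1 paying group = 仅L1 + L1+L2
l2_group = only_l2 + both                 # L2 paying group = 仅L2 + L1+L2
dedup_total = only_l1 + only_l2 + both    # every user counted once

# No-consumption share recomputed from the logged counts.
l1_rate = 661 / l1_group * 100
l2_rate = 1451 / l2_group * 100

print(l1_group, l2_group, dedup_total)
print(f"{l1_rate:.0f}% {l2_rate:.0f}%")
```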
BIN
output/L1_all_avg_trend.png
Normal file
Size: 201 KiB
BIN
output/L1_all_users_stack.png
Normal file
Size: 80 KiB
BIN
output/L1_avg_trend.png
Normal file
Size: 124 KiB
BIN
output/L1_avg_trend_v4.png
Normal file
Size: 110 KiB
BIN
output/L1_users_stack.png
Normal file
Size: 70 KiB
BIN
output/L1_users_stack_v4.png
Normal file
Size: 82 KiB
BIN
output/L2_all_avg_trend.png
Normal file
Size: 174 KiB
BIN
output/L2_all_users_stack.png
Normal file
Size: 81 KiB
BIN
output/L2_avg_trend.png
Normal file
Size: 168 KiB
BIN
output/L2_avg_trend_v4.png
Normal file
Size: 170 KiB
BIN
output/L2_users_stack.png
Normal file
Size: 82 KiB
BIN
output/L2_users_stack_v4.png
Normal file
Size: 83 KiB
1
output/course_data_v4.json
Normal file
87
scripts/charts_v4.py
Normal file
@ -0,0 +1,87 @@
#!/usr/bin/env python3
"""Charts v4: L1 charts cover only L1 lessons, L2 charts cover only L2 lessons."""
import json, os
from datetime import date, timedelta
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.font_manager as fm
import numpy as np

fm.fontManager.addfont('/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc')
plt.rcParams['font.family'] = fm.FontProperties(fname='/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc').get_name()
plt.rcParams['axes.unicode_minus'] = False

with open('/root/.openclaw/workspace/output/course_data_v4.json') as f:
    data = json.load(f)
results = data['results']

out = '/root/.openclaw/workspace/output'

configs = {
    'L1': {'prefix': 'L1', 'color': '#4A90D9', 'light': '#A8CFF1', 'label': 'L1'},
    'L2': {'prefix': 'L2', 'color': '#E85D47', 'light': '#F4A9A0', 'label': 'L2'},
}

for key, cfg in configs.items():
    pfx = cfg['prefix']; color = cfg['color']; light = cfg['light']; label = cfg['label']
    # Start from the first week that has any paying users for this level
    first = next(i for i, r in enumerate(results) if r[f'{pfx}_paid'] > 0)
    data_sub = results[first:]

    dates = [date.fromisoformat(r['ws']) for r in data_sub]
    xs = [d + timedelta(days=3) for d in dates]  # plot each point 3 days after the week start
    paid = [r[f'{pfx}_paid'] for r in data_sub]
    cons_users = [r[f'{pfx}_cons_users'] for r in data_sub]
    no_cons = [r[f'{pfx}_no_cons'] for r in data_sub]
    avg_all = [r[f'{pfx}_avg_all'] for r in data_sub]
    avg_cons = [r[f'{pfx}_avg_cons'] for r in data_sub]

    # Chart 1: stacked bars
    fig, ax = plt.subplots(figsize=(18, 8))
    x_idx = np.arange(len(xs))
    ax.bar(x_idx, cons_users, 0.65, color=light, label='Users with consumption', zorder=3)
    ax.bar(x_idx, no_cons, 0.65, bottom=cons_users, color='#D0D0D0', label='Users without consumption', zorder=3)
    step = max(1, len(data_sub)//10)
    for i in range(0, len(data_sub), step):
        ax.annotate(str(paid[i]), (i, paid[i]), textcoords='offset points', xytext=(0, 5),
                    fontsize=7.5, ha='center', color='#333333', fontweight='bold')
    ax.set_xticks(x_idx[::step])
    ax.set_xticklabels([dates[i].strftime('%m/%d') for i in range(0, len(data_sub), step)], fontsize=8.5, rotation=45)
    ax.set_ylabel('Users', fontsize=13)
    ax.set_title(f'{label} paying users: weekly lesson-consumption distribution ({label} lessons only, U0 excluded)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(axis='y', alpha=0.3, zorder=0)
    ax.set_xlim(-0.5, len(x_idx)-0.5)
    no_rate = no_cons[-1]/paid[-1]*100 if paid[-1] else 0
    ax.text(0.97, 0.95, f'{paid[-1]} paying users | no-consumption rate {no_rate:.0f}%',
            transform=ax.transAxes, fontsize=11, ha='right', va='top', color='#666666', fontstyle='italic')
    plt.tight_layout()
    plt.savefig(f'{out}/{pfx}_users_stack_v4.png', dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f'  ✅ {pfx}_users_stack_v4.png')

    # Chart 2: line chart
    fig, ax = plt.subplots(figsize=(18, 8))
    ax.plot(xs, avg_all, 'o-', color='#999999', linewidth=2.2, markersize=5, label='Lessons per user (all paying users)', markerfacecolor='white')
    ax.plot(xs, avg_cons, 's-', color=color, linewidth=2.8, markersize=5, label='Lessons per user (users with consumption)', markerfacecolor='white')
    ax.fill_between(xs, avg_all, avg_cons, alpha=0.08, color=color)
    for i in range(0, len(data_sub), max(1, len(data_sub)//8)):
        ax.annotate(f'{avg_all[i]:.1f}', (xs[i], avg_all[i]), textcoords='offset points',
                    xytext=(0,-15), fontsize=7.5, color='#999999', ha='center')
        ax.annotate(f'{avg_cons[i]:.1f}', (xs[i], avg_cons[i]), textcoords='offset points',
                    xytext=(0,7), fontsize=7.5, color=color, ha='center', fontweight='bold')
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    plt.setp(ax.xaxis.get_majorticklabels(), rotation=45, fontsize=9)
    ax.set_ylabel('Lessons consumed (per week)', fontsize=13)
    ax.set_title(f'{label} paying users: weekly lessons consumed per user ({label} lessons only, U0 excluded)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(True, alpha=0.3)
    ax.set_xlim(date(2025,8,30), date(2026,5,12))
    plt.tight_layout()
    plt.savefig(f'{out}/{pfx}_avg_trend_v4.png', dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f'  ✅ {pfx}_avg_trend_v4.png')

print('\n✅ 4 v4 charts generated')
165
scripts/course_analysis_v4.py
Normal file
@ -0,0 +1,165 @@
#!/usr/bin/env python3
"""
v4: consumption for the L1 paying group counts only L1 lessons; consumption for the L2 paying group counts only L2 lessons
"""
import os
import psycopg2
from collections import defaultdict
from datetime import datetime, timedelta, date

conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591, user="ai_member",
    # Assumption: the password is read from a VALA_BI_PASSWORD environment variable
    # (hypothetical name) instead of being hardcoded in the script.
    password=os.environ["VALA_BI_PASSWORD"], dbname="vala_bi"
)
cur = conn.cursor()

# Fetch the valid L1/L2 chapters (U0 excluded)
cur.execute("SELECT id FROM bi_level_unit_lesson WHERE course_level='L1'")
l1_chapters = set(r[0] for r in cur.fetchall())
cur.execute("SELECT id FROM bi_level_unit_lesson WHERE course_level='L2'")
l2_chapters = set(r[0] for r in cur.fetchall())
u0 = {55, 56, 57, 58, 59, 343, 344, 345, 346, 348}
l1_chapters -= u0
l2_chapters -= u0
print(f"L1 chapters: {len(l1_chapters)} | L2 chapters: {len(l2_chapters)}")

overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)

weeks = []
d = overall_start
while d < overall_end:
    ws = d
    we = d + timedelta(days=6 - d.weekday())
    if we >= overall_end: we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)

print("Classifying paying users...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END as level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()

cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())

user_levels = defaultdict(set)
user_orders = defaultdict(list)
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_levels[aid].add(lt)
    user_orders[aid].append((pay_date.date(), is_refunded))

def is_paid(aid, as_of):
    # Paying as of a date: at least one non-refunded order paid on or before it
    return sum(1 for pd, ref in user_orders[aid] if pd <= as_of and not ref) > 0

l1_pool = {aid for aid, lv in user_levels.items() if 'L1' in lv or 'L1+L2' in lv}
l2_pool = {aid for aid, lv in user_levels.items() if 'L2' in lv or 'L1+L2' in lv}
all_pool = l1_pool | l2_pool
print(f"L1 pool: {len(l1_pool)}, L2 pool: {len(l2_pool)}, total: {len(all_pool)}")

print("Querying lesson consumption...")
cons_map = {}
for ti in range(8):
    tbl = f"bi_user_chapter_play_record_{ti}"
    cur.execute(f"""SELECT user_id, chapter_id, updated_at FROM {tbl}
        WHERE play_status = 1 AND updated_at >= '2025-09-01' AND updated_at < '2026-05-11'""")
    for uid, cid, ua in cur.fetchall():
        if cid in u0: continue
        # Keep only L1 or L2 lessons
        if cid not in l1_chapters and cid not in l2_chapters: continue
        key = (uid, cid)
        d = ua.date() if hasattr(ua, 'date') else datetime.strptime(str(ua)[:10], '%Y-%m-%d').date()
        # Keep the earliest date per (user_id, chapter_id)
        if key not in cons_map or d < cons_map[key]:
            cons_map[key] = d

print("Mapping characters to accounts...")
all_uids = list(set(k[0] for k in cons_map))
char2acct = {}
for i in range(0, len(all_uids), 500):
    batch = all_uids[i:i+500]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall(): char2acct[cid] = aid

print("Aggregating by week...")
results = []
for ws, we in weeks:
    l1_paid = {aid for aid in l1_pool if is_paid(aid, we)}
    l2_paid = {aid for aid in l2_pool if is_paid(aid, we)}
    t_paid = {aid for aid in all_pool if is_paid(aid, we)}

    l1_cons, l1_cu = 0, set()
    l2_cons, l2_cu = 0, set()
    t_cons, t_cu = 0, set()

    for (uid, ch_id), cons_date in cons_map.items():
        if not (ws <= cons_date <= we): continue
        aid = char2acct.get(uid)
        if not aid: continue

        # L1 paying group and an L1 lesson
        if aid in l1_paid and ch_id in l1_chapters:
            l1_cons += 1
            l1_cu.add(aid)
        # L2 paying group and an L2 lesson
        if aid in l2_paid and ch_id in l2_chapters:
            l2_cons += 1
            l2_cu.add(aid)
        # Total: consumption by paying users on lessons of their own level
        if aid in t_paid:
            if (aid in l1_paid and ch_id in l1_chapters) or (aid in l2_paid and ch_id in l2_chapters):
                t_cons += 1
                t_cu.add(aid)

    results.append({
        'ws': ws, 'we': we,
        'L1_paid': len(l1_paid), 'L1_cons': l1_cons, 'L1_cons_users': len(l1_cu),
        'L1_no_cons': len(l1_paid) - len(l1_cu),
        'L1_avg_all': round(l1_cons / len(l1_paid), 2) if l1_paid else 0,
        'L1_avg_cons': round(l1_cons / len(l1_cu), 2) if l1_cu else 0,
        'L2_paid': len(l2_paid), 'L2_cons': l2_cons, 'L2_cons_users': len(l2_cu),
        'L2_no_cons': len(l2_paid) - len(l2_cu),
        'L2_avg_all': round(l2_cons / len(l2_paid), 2) if l2_paid else 0,
        'L2_avg_cons': round(l2_cons / len(l2_cu), 2) if l2_cu else 0,
        'total_paid': len(t_paid), 'total_cons': t_cons, 'total_cons_users': len(t_cu),
        'total_no_cons': len(t_paid) - len(t_cu),
        'total_avg_all': round(t_cons / len(t_paid), 2) if t_paid else 0,
        'total_avg_cons': round(t_cons / len(t_cu), 2) if t_cu else 0,
    })

    # Progress line for the first week, every 8th week after it, and the last week
    r = results[-1]
    if (len(results) - 1) % 8 == 0 or len(results) == len(weeks):
        print(f"  W{len(results):2d} {ws}~{we} | L1:{r['L1_paid']} with consumption {r['L1_cons_users']} | L2:{r['L2_paid']} with consumption {r['L2_cons_users']}")

cur.close()
conn.close()

# Print the final results
last = results[-1]
print(f"\n=== Final data (v4: L1 counts only L1 lessons, L2 counts only L2 lessons) ===")
print(f"L1 paying group: {last['L1_paid']} users | with consumption {last['L1_cons_users']} | without {last['L1_no_cons']}({last['L1_no_cons']/last['L1_paid']*100:.0f}%) | per paying user {last['L1_avg_all']} | per user with consumption {last['L1_avg_cons']}")
print(f"L2 paying group: {last['L2_paid']} users | with consumption {last['L2_cons_users']} | without {last['L2_no_cons']}({last['L2_no_cons']/last['L2_paid']*100:.0f}%) | per paying user {last['L2_avg_all']} | per user with consumption {last['L2_avg_cons']}")
print(f"Total (deduplicated): {last['total_paid']} users | with consumption {last['total_cons_users']} | without {last['total_no_cons']}({last['total_no_cons']/last['total_paid']*100:.0f}%)")

# Save the data as JSON for the chart script that runs afterwards
import json
out = '/root/.openclaw/workspace/output/course_data_v4.json'
serializable = []
for r in results:
    d = {}
    for k, v in r.items():
        # date objects are not JSON-serializable; store them as ISO strings
        if isinstance(v, date): d[k] = v.isoformat()
        else: d[k] = v
    serializable.append(d)
with open(out, 'w') as f:
    json.dump({'results': serializable, 'L1_chapters': list(l1_chapters), 'L2_chapters': list(l2_chapters)}, f, ensure_ascii=False)
print(f"\nData saved: {out}")
@ -1,167 +1,191 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
课消指标:按 L1/L2 分等级统计
|
||||
课消指标:按周统计 2025-09-01 ~ 2026-05-10,按 L1/L2/L1+L2 拆分
|
||||
"""
|
||||
import psycopg2
|
||||
from collections import defaultdict
|
||||
from datetime import date, timedelta, datetime
|
||||
from datetime import datetime, timedelta, date
|
||||
|
||||
conn = psycopg2.connect(
|
||||
host="bj-postgres-16pob4sg.sql.tencentcdb.com",
|
||||
port=28591,
|
||||
user="ai_member",
|
||||
password="LdfjdjL83h3h3^$&**YGG*",
|
||||
dbname="vala_bi"
|
||||
port=28591, user="ai_member",
|
||||
password="LdfjdjL83h3h3^$&**YGG*", dbname="vala_bi"
|
||||
)
|
||||
cur = conn.cursor()
|
||||
|
||||
# ===== 时间参数 =====
|
||||
overall_start = date(2025, 9, 1)
|
||||
overall_end = date(2026, 5, 11)
|
||||
|
||||
# 生成周列表
|
||||
# 生成周列表(周一~周日)
|
||||
weeks = []
|
||||
d = overall_start
|
||||
while d < overall_end:
|
||||
ws = d
|
||||
we = d + timedelta(days=6 - d.weekday())
|
||||
days_to_sunday = 6 - d.weekday()
|
||||
we = d + timedelta(days=days_to_sunday)
|
||||
if we >= overall_end:
|
||||
we = overall_end - timedelta(days=1)
|
||||
weeks.append((ws, we))
|
||||
d = we + timedelta(days=1)
|
||||
|
||||
# ===== 获取 L1/L2 chapter_id =====
|
||||
u0_ids = {343, 344, 345, 346, 348, 55, 56, 57, 58, 59}
|
||||
cur.execute("SELECT DISTINCT id, course_level FROM bi_level_unit_lesson WHERE course_level IN ('L1','L2')")
|
||||
l1_chapters = set()
|
||||
l2_chapters = set()
|
||||
for cid, lv in cur.fetchall():
|
||||
if cid in u0_ids:
|
||||
continue
|
||||
if lv == 'L1':
|
||||
l1_chapters.add(cid)
|
||||
elif lv == 'L2':
|
||||
l2_chapters.add(cid)
|
||||
print(f"统计区间: {overall_start} ~ {overall_end - timedelta(days=1)}, 共 {len(weeks)} 周")
|
||||
|
||||
print(f"L1 chapters: {len(l1_chapters)}, L2 chapters: {len(l2_chapters)}")
|
||||
|
||||
# ===== Step 1: 付费用户 =====
|
||||
print("Step 1: 查找付费用户...")
|
||||
# ===== Step 1: 用户 L1/L2 分类 + 付费状态 =====
|
||||
print("\nStep 1: 分类付费用户...")
|
||||
cur.execute("""
|
||||
SELECT DISTINCT o.account_id
|
||||
SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
|
||||
CASE
|
||||
WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
|
||||
WHEN o.goods_id = 61 THEN 'L1+L2'
|
||||
WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
|
||||
ELSE '其他'
|
||||
END as level_type
|
||||
FROM bi_vala_order o
|
||||
INNER JOIN bi_vala_app_account a ON o.account_id = a.id
|
||||
WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
|
||||
GROUP BY o.account_id
|
||||
HAVING COUNT(CASE WHEN o.order_status != 4
|
||||
OR (o.order_status = 4 AND o.trade_no NOT IN (
|
||||
SELECT trade_no FROM bi_refund_order WHERE status=3
|
||||
)) THEN 1 END) > 0
|
||||
""")
|
||||
paid_account_ids = [row[0] for row in cur.fetchall()]
|
||||
print(f" 付费用户: {len(paid_account_ids)}")
|
||||
|
||||
# 订单详情用于动态判断每周付费用户
|
||||
cur.execute("""
|
||||
SELECT o.account_id, o.trade_no, o.out_trade_no, o.pay_success_date, o.order_status
|
||||
FROM bi_vala_order o
|
||||
INNER JOIN bi_vala_app_account a ON o.account_id = a.id
|
||||
WHERE a.status=1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
|
||||
AND o.pay_success_date >= '2025-01-01'
|
||||
WHERE a.status = 1 AND a.deleted_at IS NULL
|
||||
AND o.pay_success_date IS NOT NULL
|
||||
""")
|
||||
orders = cur.fetchall()
|
||||
cur.execute("SELECT trade_no, status FROM bi_refund_order WHERE status=3")
|
||||
refund_set = {r[0] for r in cur.fetchall() if r[0]}
|
||||
print(f" 订单数: {len(orders)}")
|
||||
|
||||
account_orders = defaultdict(list)
|
||||
for aid, tn, otn, psd, os in orders:
|
||||
is_ref = os == 4 and tn in refund_set
|
||||
account_orders[aid].append((psd, is_ref))
|
||||
cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
|
||||
refund_trades = set(r[0] for r in cur.fetchall())
|
||||
|
||||
def is_paid(aid, as_of):
|
||||
return sum(1 for pd, ref in account_orders.get(aid, []) if pd.date() <= as_of and not ref) > 0
|
||||
# {account_id: {'levels': set, 'orders': [(pay_date, is_refunded, level), ...]}}
|
||||
user_data = defaultdict(lambda: {'levels': set(), 'orders': []})
|
||||
for aid, trade_no, order_status, pay_date, lt in orders:
|
||||
is_refunded = (order_status == 4 and trade_no in refund_trades)
|
||||
user_data[aid]['levels'].add(lt)
|
||||
user_data[aid]['orders'].append((pay_date.date(), is_refunded, lt))
|
||||
|
||||
# ===== Step 2: 课消(分L1/L2)=====
|
||||
print("Step 2: 查询课消(分L1/L2)...")
|
||||
l1_consumption = {} # (user_id, chapter_id) -> earliest date
|
||||
l2_consumption = {}
|
||||
# 确定每位用户的 L1/L2 分类
|
||||
def classify_user(levels):
|
||||
has_l1 = 'L1' in levels
|
||||
has_l2 = 'L2' in levels
|
||||
has_l1l2 = 'L1+L2' in levels
|
||||
if has_l1l2 or (has_l1 and has_l2):
|
||||
return 'L1+L2'
|
||||
elif has_l1:
|
||||
return '仅L1'
|
||||
elif has_l2:
|
||||
return '仅L2'
|
||||
return '其他'
|
||||
|
||||
for t in range(8):
|
||||
tbl = f"bi_user_chapter_play_record_{t}"
|
||||
for aid in user_data:
|
||||
user_data[aid]['category'] = classify_user(user_data[aid]['levels'])
|
||||
|
||||
# 统计各类用户数
|
||||
cats = defaultdict(int)
|
||||
for aid, d in user_data.items():
|
||||
cats[d['category']] += 1
|
||||
print(f" 仅L1: {cats['仅L1']}, 仅L2: {cats['仅L2']}, L1+L2: {cats['L1+L2']}, 其他: {cats['其他']}")
|
||||
|
||||
# 判断某用户截至某日是否为付费用户
|
||||
def is_paid_as_of(aid, as_of_date):
|
||||
d = user_data[aid]
|
||||
unpaid = sum(1 for pd, ref, lt in d['orders'] if pd <= as_of_date and not ref)
|
||||
return unpaid > 0
|
||||
|
||||
# ===== Step 2: 课消记录 =====
|
||||
print("\nStep 2: 查询课消...")
|
||||
consumption_map = {} # (user_id, chapter_id) -> earliest date
|
||||
|
||||
for table_idx in range(8):
|
||||
tbl = f"bi_user_chapter_play_record_{table_idx}"
|
||||
cur.execute(f"""
|
||||
SELECT user_id, chapter_id, updated_at FROM {tbl}
|
||||
WHERE play_status=1 AND updated_at>='2025-09-01' AND updated_at<'2026-05-11'
|
||||
SELECT user_id, chapter_id, updated_at
|
||||
FROM {tbl}
|
||||
WHERE play_status = 1
|
||||
AND updated_at >= '2025-09-01'
|
||||
AND updated_at < '2026-05-11'
|
||||
""")
|
||||
for uid, cid, upd in cur.fetchall():
|
||||
if cid in l1_chapters:
|
||||
k, m = (uid, cid), l1_consumption
|
||||
elif cid in l2_chapters:
|
||||
k, m = (uid, cid), l2_consumption
|
||||
else:
|
||||
continue
|
||||
d = upd.date() if hasattr(upd, 'date') else upd
|
||||
if k not in m or d < m[k]:
|
||||
m[k] = d
|
||||
cnt = 0
|
||||
for user_id, chapter_id, updated_at in cur.fetchall():
|
||||
key = (user_id, chapter_id)
|
||||
d = updated_at.date() if hasattr(updated_at, 'date') else datetime.strptime(str(updated_at)[:10], '%Y-%m-%d').date()
|
||||
if key not in consumption_map or d < consumption_map[key]:
|
||||
consumption_map[key] = d
|
||||
cnt += 1
|
||||
print(f" {tbl}: {cnt} 条")
|
||||
print(f" 去重后: {len(consumption_map)} 条")
|
||||
|
||||
print(f" L1 课消(去重): {len(l1_consumption)}")
|
||||
print(f" L2 课消(去重): {len(l2_consumption)}")
|
||||
|
||||
# ===== Step 3: 角色映射 =====
|
||||
print("Step 3: 关联角色...")
|
||||
all_uids = set(k[0] for k in l1_consumption) | set(k[0] for k in l2_consumption)
|
||||
char_to_account = {}
|
||||
for i in range(0, len(all_uids), 500):
|
||||
batch = list(all_uids)[i:i+500]
|
||||
# ===== Step 3: character -> account =====
|
||||
print("\nStep 3: 角色映射...")
|
||||
all_uids = list(set(k[0] for k in consumption_map))
|
||||
char2acct = {}
|
||||
bs = 500
|
||||
for i in range(0, len(all_uids), bs):
|
||||
batch = all_uids[i:i+bs]
|
||||
ph = ','.join(['%s'] * len(batch))
|
||||
cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
|
||||
for cid, aid in cur.fetchall():
|
||||
char_to_account[cid] = aid
|
||||
char2acct[cid] = aid
|
||||
print(f" 映射: {len(char2acct)}")
|
||||
|
||||
# ===== Step 4: 按周汇总 =====
|
||||
print("Step 4: 按周汇总...")
|
||||
# ===== Step 4: 按周 + 按分类汇总 =====
|
||||
print("\nStep 4: 按周汇总...\n")
|
||||
|
||||
def weekly_stats(consumption_map):
|
||||
"""返回每周的 (课消次数, 有消用户数)"""
|
||||
results = []
|
||||
for ws, we in weeks:
|
||||
cons = 0
|
||||
users = set()
|
||||
for (uid, ch_id), d in consumption_map.items():
|
||||
if ws <= d <= we:
|
||||
cons += 1
|
||||
aid = char_to_account.get(uid)
|
||||
# 截至 we 的付费用户(按分类)
|
||||
paid_by_cat = defaultdict(set)
|
||||
for aid in user_data:
|
||||
if is_paid_as_of(aid, we):
|
||||
cat = user_data[aid]['category']
|
||||
paid_by_cat[cat].add(aid)
|
||||
|
||||
# 该周课消(付费用户)
|
||||
cons_by_cat = defaultdict(int)
|
||||
cons_users_by_cat = defaultdict(set)
|
||||
|
||||
for (uid, ch_id), cons_date in consumption_map.items():
|
||||
if ws <= cons_date <= we:
|
||||
aid = char2acct.get(uid)
|
||||
if aid:
|
||||
users.add(aid)
|
||||
results.append((ws, we, cons, len(users)))
|
||||
return results
|
||||
cat = user_data.get(aid, {}).get('category', '其他')
|
||||
if aid in paid_by_cat.get(cat, set()):
|
||||
cons_by_cat[cat] += 1
|
||||
cons_users_by_cat[cat].add(aid)
|
||||
|
||||
l1_stats = weekly_stats(l1_consumption)
|
||||
l2_stats = weekly_stats(l2_consumption)
|
||||
week_label = f"{ws.strftime('%m/%d')}-{we.strftime('%m/%d')}"
|
||||
row = {'week': week_label, 'ws': ws, 'we': we}
|
||||
|
||||
# 汇总 + 付费用户
|
||||
results = []
|
||||
for i, (ws, we) in enumerate(weeks):
|
||||
paid = set(aid for aid in account_orders if is_paid(aid, we))
|
||||
n_paid = len(paid)
|
||||
for cat in ['仅L1', '仅L2', 'L1+L2', '其他', '合计']:
|
||||
if cat == '合计':
|
||||
n_paid = sum(len(v) for v in paid_by_cat.values())
|
||||
n_cons = sum(cons_by_cat.values())
|
||||
n_cons_users = len(set.union(*cons_users_by_cat.values())) if cons_users_by_cat else 0
|
||||
else:
|
||||
n_paid = len(paid_by_cat.get(cat, set()))
|
||||
n_cons = cons_by_cat.get(cat, 0)
|
||||
n_cons_users = len(cons_users_by_cat.get(cat, set()))
|
||||
|
||||
l1_cons, l1_users = l1_stats[i][2], l1_stats[i][3]
|
||||
l2_cons, l2_users = l2_stats[i][2], l2_stats[i][3]
|
||||
avg_all = n_cons / n_paid if n_paid > 0 else 0
|
||||
avg_cons = n_cons / n_cons_users if n_cons_users > 0 else 0
|
||||
|
||||
l1_avg = l1_cons / n_paid if n_paid else 0
|
||||
l1_act_avg = l1_cons / l1_users if l1_users else 0
|
||||
l2_avg = l2_cons / n_paid if n_paid else 0
|
||||
l2_act_avg = l2_cons / l2_users if l2_users else 0
|
||||
row[f'{cat}_paid'] = n_paid
|
||||
row[f'{cat}_cons'] = n_cons
|
||||
row[f'{cat}_users'] = n_cons_users
|
||||
row[f'{cat}_avg_all'] = avg_all
|
||||
row[f'{cat}_avg_cons'] = avg_cons
|
||||
|
||||
results.append({
|
||||
'week': f"{ws.strftime('%m/%d')}-{we.strftime('%m/%d')}",
|
||||
'ws': ws, 'we': we, 'paid': n_paid,
|
||||
'l1_cons': l1_cons, 'l1_users': l1_users, 'l1_avg': l1_avg, 'l1_act': l1_act_avg,
|
||||
'l2_cons': l2_cons, 'l2_users': l2_users, 'l2_avg': l2_avg, 'l2_act': l2_act_avg,
|
||||
})
|
||||
results.append(row)
|
||||
print(f" {week_label} | 合计:付费{row['合计_paid']} 课消{row['合计_cons']} "
|
||||
f"人均{row['合计_avg_all']:.2f} | "
|
||||
f"L1:{row['仅L1_avg_all']:.2f} L2:{row['仅L2_avg_all']:.2f} L1+L2:{row['L1+L2_avg_all']:.2f}")
|
||||
|
||||
# ===== 输出完整表 =====
|
||||
print("\n" + "="*120)
|
||||
header = f"{'周':<12} {'合计付费':>6} {'合计课消':>7} {'合计人均':>7} | {'L1付费':>6} {'L1课消':>6} {'L1人均':>6} {'L1有消人均':>7} | {'L2付费':>6} {'L2课消':>6} {'L2人均':>6} {'L2有消人均':>7} | {'L1L2付费':>7} {'L1L2课消':>7} {'L1L2人均':>7} {'L1L2有消人均':>8}"
|
||||
print(header)
|
||||
print("-"*120)
|
||||
|
||||
# 输出
|
||||
print(f"\n{'周':<16} {'付费':>6} {'L1课消':>7} {'L1有消':>7} {'L1人均':>7} {'L1有消人均':>9} {'L2课消':>7} {'L2有消':>7} {'L2人均':>7} {'L2有消人均':>9}")
|
||||
for r in results:
|
||||
print(f"{r['week']:<16} {r['paid']:>6} {r['l1_cons']:>7} {r['l1_users']:>7} {r['l1_avg']:>7.2f} {r['l1_act']:>9.2f} {r['l2_cons']:>7} {r['l2_users']:>7} {r['l2_avg']:>7.2f} {r['l2_act']:>9.2f}")
|
||||
print(f"{r['week']:<12} {r['合计_paid']:>6} {r['合计_cons']:>7} {r['合计_avg_all']:>7.2f} | "
|
||||
f"{r['仅L1_paid']:>6} {r['仅L1_cons']:>6} {r['仅L1_avg_all']:>6.2f} {r['仅L1_avg_cons']:>7.2f} | "
|
||||
f"{r['仅L2_paid']:>6} {r['仅L2_cons']:>6} {r['仅L2_avg_all']:>6.2f} {r['仅L2_avg_cons']:>7.2f} | "
|
||||
f"{r['L1+L2_paid']:>7} {r['L1+L2_cons']:>7} {r['L1+L2_avg_all']:>7.2f} {r['L1+L2_avg_cons']:>8.2f}")
|
||||
|
||||
cur.close()
|
||||
conn.close()
|
||||
|
||||
395
scripts/course_consumption_v2.py
Normal file
@ -0,0 +1,395 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
课消指标 v2:剔除 U0 序章,4张图按 L1/L2 拆分
|
||||
"""
|
||||
import psycopg2
|
||||
from collections import defaultdict
|
||||
from datetime import datetime, timedelta, date
|
||||
import openpyxl
|
||||
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
|
||||
from openpyxl.chart import LineChart, BarChart, Reference
|
||||
from openpyxl.chart.series import DataPoint
|
||||
from openpyxl.chart.label import DataLabelList
|
||||
from openpyxl.utils import get_column_letter
|
||||
|
||||
conn = psycopg2.connect(
|
||||
host="bj-postgres-16pob4sg.sql.tencentcdb.com",
|
||||
port=28591, user="ai_member",
|
||||
password="LdfjdjL83h3h3^$&**YGG*", dbname="vala_bi"
|
||||
)
|
||||
cur = conn.cursor()
|
||||
|
# ===== U0 chapter_ids to exclude =====
u0_chapters = {55, 56, 57, 58, 59, 343, 344, 345, 346, 348}
print(f"剔除 U0 序章: {sorted(u0_chapters)}")

# ===== Time parameters =====
overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)

# Split [overall_start, overall_end) into weeks that end on Sunday
weeks = []
d = overall_start
while d < overall_end:
    ws = d
    days_to_sunday = 6 - d.weekday()
    we = d + timedelta(days=days_to_sunday)
    if we >= overall_end:
        we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)

# ===== Step 1: classify users =====
print("\nStep 1: 分类付费用户...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END as level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()

cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())

user_data = defaultdict(lambda: {'levels': set(), 'orders': []})
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_data[aid]['levels'].add(lt)
    user_data[aid]['orders'].append((pay_date.date(), is_refunded, lt))

def classify_user(levels):
    """Map the set of level types an account has bought to one category."""
    has_l1, has_l2 = 'L1' in levels, 'L2' in levels
    return 'L1+L2' if ('L1+L2' in levels or (has_l1 and has_l2)) else ('仅L1' if has_l1 else ('仅L2' if has_l2 else '其他'))

for aid in user_data:
    user_data[aid]['category'] = classify_user(user_data[aid]['levels'])

def is_paid_as_of(aid, as_of_date):
    """True if the account has at least one non-refunded order paid on or before as_of_date."""
    return sum(1 for pd, ref, lt in user_data[aid]['orders'] if pd <= as_of_date and not ref) > 0

# ===== Step 2: course consumption (excluding U0) =====
print("\nStep 2: 查询课消(剔除U0)...")
consumption_map = {}
u0_skipped = 0
for table_idx in range(8):
    tbl = f"bi_user_chapter_play_record_{table_idx}"
    cur.execute(f"""
        SELECT user_id, chapter_id, updated_at
        FROM {tbl}
        WHERE play_status = 1 AND updated_at >= '2025-09-01' AND updated_at < '2026-05-11'
    """)
    for user_id, chapter_id, updated_at in cur.fetchall():
        if chapter_id in u0_chapters:
            u0_skipped += 1
            continue
        key = (user_id, chapter_id)
        d = updated_at.date() if hasattr(updated_at, 'date') else datetime.strptime(str(updated_at)[:10], '%Y-%m-%d').date()
        # Keep only the earliest completion date per (user, chapter)
        if key not in consumption_map or d < consumption_map[key]:
            consumption_map[key] = d

print(f"  剔除U0课消: {u0_skipped} 条, 去重后: {len(consumption_map)} 条")

# ===== Step 3: character-to-account mapping =====
print("Step 3: 角色映射...")
all_uids = list(set(k[0] for k in consumption_map))
char2acct = {}
bs = 500  # batch size for the IN (...) lookup
for i in range(0, len(all_uids), bs):
    batch = all_uids[i:i+bs]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall():
        char2acct[cid] = aid
print(f"  映射: {len(char2acct)}")

# ===== Step 4: weekly aggregation =====
print("Step 4: 按周汇总...")
results = []
for ws, we in weeks:
    # Accounts that count as paying at the end of this week, grouped by category
    paid_by_cat = defaultdict(set)
    for aid in user_data:
        if is_paid_as_of(aid, we):
            paid_by_cat[user_data[aid]['category']].add(aid)

    cons_by_cat = defaultdict(int)
    cons_users_by_cat = defaultdict(set)

    for (uid, ch_id), cons_date in consumption_map.items():
        if ws <= cons_date <= we:
            aid = char2acct.get(uid)
            if aid:
                cat = user_data.get(aid, {}).get('category', '其他')
                if aid in paid_by_cat.get(cat, set()):
                    cons_by_cat[cat] += 1
                    cons_users_by_cat[cat].add(aid)

    row = {'ws': ws, 'we': we}
    for cat in ['仅L1', '仅L2', 'L1+L2', '其他', '合计']:
        if cat == '合计':
            n_paid = sum(len(v) for v in paid_by_cat.values())
            n_cons = sum(cons_by_cat.values())
            n_cons_users = len(set.union(*cons_users_by_cat.values())) if cons_users_by_cat else 0
        else:
            n_paid = len(paid_by_cat.get(cat, set()))
            n_cons = cons_by_cat.get(cat, 0)
            n_cons_users = len(cons_users_by_cat.get(cat, set()))

        row[f'{cat}_paid'] = n_paid
        row[f'{cat}_cons'] = n_cons
        row[f'{cat}_cons_users'] = n_cons_users
        row[f'{cat}_no_cons'] = n_paid - n_cons_users
        row[f'{cat}_avg_all'] = round(n_cons / n_paid, 2) if n_paid > 0 else 0
        row[f'{cat}_avg_cons'] = round(n_cons / n_cons_users, 2) if n_cons_users > 0 else 0

    results.append(row)

cur.close()
conn.close()

# ===== Report the first week with paying users (> 0) per category; no rows are dropped here =====
for cat in ['仅L1', '仅L2', 'L1+L2']:
    # Find the first week whose paid count is > 0
    first_idx = next((i for i, r in enumerate(results) if r[f'{cat}_paid'] > 0), 0)
    print(f"{cat} 数据起于第 {first_idx+1} 周 ({results[first_idx]['ws']})")

# ===== Build the Excel workbook =====
print("\n生成 Excel...")
wb = openpyxl.Workbook()
wb.remove(wb.active)

# Styles
header_font = Font(name='微软雅黑', bold=True, size=9, color='FFFFFF')
header_fill = PatternFill(start_color='2F5496', end_color='2F5496', fill_type='solid')
data_font = Font(name='微软雅黑', size=9)
title_font = Font(name='微软雅黑', bold=True, size=14, color='2F5496')
subtitle_font = Font(name='微软雅黑', bold=True, size=11, color='2F5496')
border = Border(left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin'))
center = Alignment(horizontal='center', vertical='center')

l1_color = '4A90D9'
l2_color = 'E85D47'
l1l2_color = '7B9E4B'

def apply_cell(ws, row, col, value, font=data_font, fill=None, align=center, border_style=border):
    """Write a data cell; border_style=None leaves the cell without a border."""
    c = ws.cell(row=row, column=col, value=value)
    c.font, c.alignment = font, align
    if border_style is not None:
        c.border = border_style
    if fill:
        c.fill = fill
    return c

def apply_header(ws, row, col, value):
    """Write a header cell with the shared header font, fill and border."""
    c = ws.cell(row=row, column=col, value=value)
    c.font, c.fill, c.border, c.alignment = header_font, header_fill, border, center
    return c

# ===== Sheet 1: overview =====
ws1 = wb.create_sheet("概览")
ws1.merge_cells('A1:H1')
apply_cell(ws1, 1, 1, "付费用户 L1/L2 课消分析(剔除U0序章)", font=title_font, border_style=None, align=Alignment(horizontal='left'))

notes = [
    "口径:剔除L1/L2的U0序章课时(L1 U00: 343-348, L2 U00: 55-59),仅统计U1及之后的课消",
    "课消:用户首次完成某一课时;付费用户:status=1 + 未删除 + 有订单 + 未全部退款",
]
for i, n in enumerate(notes):
    ws1.merge_cells(f'A{3+i}:H{3+i}')
    apply_cell(ws1, 3+i, 1, n, font=Font(name='微软雅黑', size=9, color='666666'), border_style=None, align=Alignment(horizontal='left'))

# ===== Sheet 2: weekly detail =====
ws2 = wb.create_sheet("每周明细")
headers_main = ['周', '周一起', '周日'] + ['合计付费', '合计有消', '合计无消', '合计课消', '合计人均', '合计有消人均',
                                      '仅L1付费', '仅L1有消', '仅L1无消', '仅L1课消', '仅L1人均', '仅L1有消人均',
                                      '仅L2付费', '仅L2有消', '仅L2无消', '仅L2课消', '仅L2人均', '仅L2有消人均',
                                      'L1+L2付费', 'L1+L2有消', 'L1+L2无消', 'L1+L2课消', 'L1+L2人均', 'L1+L2有消人均']

for j, h in enumerate(headers_main, 1):
    apply_header(ws2, 1, j, h)

for ri, r in enumerate(results):
    row = ri + 2
    wl = f"{r['ws'].strftime('%m/%d')}-{r['we'].strftime('%m/%d')}"
    apply_cell(ws2, row, 1, wl)
    apply_cell(ws2, row, 2, r['ws'].strftime('%Y-%m-%d'))
    apply_cell(ws2, row, 3, r['we'].strftime('%Y-%m-%d'))
    col = 4
    for prefix in ['合计', '仅L1', '仅L2', 'L1+L2']:
        for metric in ['paid', 'cons_users', 'no_cons', 'cons', 'avg_all', 'avg_cons']:
            val = r[f'{prefix}_{metric}']
            apply_cell(ws2, row, col, val)
            col += 1

for ci in range(1, len(headers_main)+1):
    ws2.column_dimensions[get_column_letter(ci)].width = 11 if ci <= 3 else 10
ws2.freeze_panes = 'D2'

# ===== Sheets 3-4: L1 and L2 charts =====
sheet_names = {'仅L1': ('L1图表', 'L1', l1_color, '4A90D9'), '仅L2': ('L2图表', 'L2', l2_color, 'E85D47')}

for cat, (sname, label, color, light_color) in sheet_names.items():
    ws_chart_data = wb.create_sheet(sname)

    # Keep only the weeks from the first one in which this category has paying users
    first_idx = next((i for i, r in enumerate(results) if r[f'{cat}_paid'] > 0), 0)
    cat_results = results[first_idx:]

    # Header
    headers = ['周', '付费用户', '有课消用户', '无课消用户', '课消总数', '人均课消', '有消人均']
    for j, h in enumerate(headers, 1):
        apply_header(ws_chart_data, 1, j, h)

    for ri, r in enumerate(cat_results):
        row = ri + 2
        wl = f"{r['ws'].strftime('%m/%d')}"
        apply_cell(ws_chart_data, row, 1, wl)
        apply_cell(ws_chart_data, row, 2, r[f'{cat}_paid'])
        apply_cell(ws_chart_data, row, 3, r[f'{cat}_cons_users'])
        apply_cell(ws_chart_data, row, 4, r[f'{cat}_no_cons'])
        apply_cell(ws_chart_data, row, 5, r[f'{cat}_cons'])
        apply_cell(ws_chart_data, row, 6, r[f'{cat}_avg_all'])
        apply_cell(ws_chart_data, row, 7, r[f'{cat}_avg_cons'])

    n_rows = len(cat_results)
    cats_ref = Reference(ws_chart_data, min_col=1, min_row=2, max_row=n_rows+1)

    # --- Chart 1: stacked bar chart (users with / without consumption) ---
    chart1 = BarChart()
    chart1.type = "col"
    chart1.grouping = "stacked"
    chart1.title = f"{label} 付费用户课消分布(剔除U0序章)"
    chart1.style = 10
    chart1.width = 24
    chart1.height = 13

    # Users with consumption
    ref1 = Reference(ws_chart_data, min_col=3, min_row=1, max_row=n_rows+1)
    chart1.add_data(ref1, titles_from_data=True)
    chart1.set_categories(cats_ref)
    chart1.series[0].graphicalProperties.solidFill = light_color

    # Users without consumption
    ref2 = Reference(ws_chart_data, min_col=4, min_row=1, max_row=n_rows+1)
    chart1.add_data(ref2, titles_from_data=True)
    chart1.series[1].graphicalProperties.solidFill = 'D9D9D9'

    chart1.y_axis.title = '用户数'
    chart1.legend.position = 'b'
    ws_chart_data.add_chart(chart1, "A9")

    # --- Chart 2: line chart (average per paying user + average per consuming user) ---
    chart2 = LineChart()
    chart2.title = f"{label} 周人均课消趋势(剔除U0序章)"
    chart2.style = 10
    chart2.width = 24
    chart2.height = 13
    chart2.y_axis.title = '课消数(节/周)'

    ref3 = Reference(ws_chart_data, min_col=6, min_row=1, max_row=n_rows+1)
    chart2.add_data(ref3, titles_from_data=True)
    chart2.set_categories(cats_ref)
    chart2.series[0].graphicalProperties.line.solidFill = '999999'
    chart2.series[0].graphicalProperties.line.width = 20000
    chart2.series[0].graphicalProperties.line.dashStyle = 'dash'

    ref4 = Reference(ws_chart_data, min_col=7, min_row=1, max_row=n_rows+1)
    chart2.add_data(ref4, titles_from_data=True)
    chart2.series[1].graphicalProperties.line.solidFill = color
    chart2.series[1].graphicalProperties.line.width = 28000

    chart2.y_axis.scaling.min = 0
    chart2.legend.position = 'b'
    ws_chart_data.add_chart(chart2, "A27")

    # Column widths
    for ci in range(1, 8):
        ws_chart_data.column_dimensions[get_column_letter(ci)].width = 12

# ===== Sheet 5: L1+L2 charts (the third category) =====
ws_l1l2 = wb.create_sheet("L1+L2图表")
cat = 'L1+L2'
color = l1l2_color
light_color = 'A8C88E'
first_idx = next((i for i, r in enumerate(results) if r[f'{cat}_paid'] > 0), 0)
cat_results = results[first_idx:]

headers = ['周', '付费用户', '有课消用户', '无课消用户', '课消总数', '人均课消', '有消人均']
for j, h in enumerate(headers, 1):
    apply_header(ws_l1l2, 1, j, h)

n_rows = len(cat_results)
for ri, r in enumerate(cat_results):
    row = ri + 2
    wl = f"{r['ws'].strftime('%m/%d')}"
    apply_cell(ws_l1l2, row, 1, wl)
    apply_cell(ws_l1l2, row, 2, r[f'{cat}_paid'])
    apply_cell(ws_l1l2, row, 3, r[f'{cat}_cons_users'])
    apply_cell(ws_l1l2, row, 4, r[f'{cat}_no_cons'])
    apply_cell(ws_l1l2, row, 5, r[f'{cat}_cons'])
    apply_cell(ws_l1l2, row, 6, r[f'{cat}_avg_all'])
    apply_cell(ws_l1l2, row, 7, r[f'{cat}_avg_cons'])

cats_ref = Reference(ws_l1l2, min_col=1, min_row=2, max_row=n_rows+1)

# Stacked bar chart: users with / without consumption
chart1 = BarChart()
chart1.type = "col"
chart1.grouping = "stacked"
chart1.title = "L1+L2 付费用户课消分布(剔除U0序章)"
chart1.style = 10
chart1.width = 24
chart1.height = 13

ref1 = Reference(ws_l1l2, min_col=3, min_row=1, max_row=n_rows+1)
chart1.add_data(ref1, titles_from_data=True)
chart1.set_categories(cats_ref)
chart1.series[0].graphicalProperties.solidFill = light_color

ref2 = Reference(ws_l1l2, min_col=4, min_row=1, max_row=n_rows+1)
chart1.add_data(ref2, titles_from_data=True)
chart1.series[1].graphicalProperties.solidFill = 'D9D9D9'

chart1.y_axis.title = '用户数'
chart1.legend.position = 'b'
ws_l1l2.add_chart(chart1, "A9")

# Line chart: weekly averages
chart2 = LineChart()
chart2.title = "L1+L2 周人均课消趋势(剔除U0序章)"
chart2.style = 10
chart2.width = 24
chart2.height = 13
chart2.y_axis.title = '课消数(节/周)'

ref3 = Reference(ws_l1l2, min_col=6, min_row=1, max_row=n_rows+1)
chart2.add_data(ref3, titles_from_data=True)
chart2.set_categories(cats_ref)
chart2.series[0].graphicalProperties.line.solidFill = '999999'
chart2.series[0].graphicalProperties.line.width = 20000
chart2.series[0].graphicalProperties.line.dashStyle = 'dash'

ref4 = Reference(ws_l1l2, min_col=7, min_row=1, max_row=n_rows+1)
chart2.add_data(ref4, titles_from_data=True)
chart2.series[1].graphicalProperties.line.solidFill = color
chart2.series[1].graphicalProperties.line.width = 28000

chart2.y_axis.scaling.min = 0
chart2.legend.position = 'b'
ws_l1l2.add_chart(chart2, "A27")

for ci in range(1, 8):
    ws_l1l2.column_dimensions[get_column_letter(ci)].width = 12

# Save
path = '/root/.openclaw/workspace/output/course_consumption_by_level_v2.xlsx'
wb.save(path)
print(f"\n✅ Excel v2 已保存: {path}")

# Brief summary of the last week
last = results[-1]
print(f"""
=== 剔除U0后最终数据(截至5/10) ===
仅L1: 付费{last['仅L1_paid']} 有消{last['仅L1_cons_users']} 无消{last['仅L1_no_cons']} 人均{last['仅L1_avg_all']} 有消人均{last['仅L1_avg_cons']}
仅L2: 付费{last['仅L2_paid']} 有消{last['仅L2_cons_users']} 无消{last['仅L2_no_cons']} 人均{last['仅L2_avg_all']} 有消人均{last['仅L2_avg_cons']}
L1+L2: 付费{last['L1+L2_paid']} 有消{last['L1+L2_cons_users']} 无消{last['L1+L2_no_cons']} 人均{last['L1+L2_avg_all']} 有消人均{last['L1+L2_avg_cons']}
合计: 付费{last['合计_paid']} 有消{last['合计_cons_users']} 无消{last['合计_no_cons']} 人均{last['合计_avg_all']} 有消人均{last['合计_avg_cons']}
""")
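The week-splitting loop near the top of `scripts/course_consumption_v2.py` can be read in isolation as a small function. The sketch below is an illustration only, not part of the commit: the name `split_into_weeks` is hypothetical, and it assumes the same half-open `[start, end)` range and Sunday week end that the script uses.

```python
from datetime import date, timedelta


def split_into_weeks(start: date, end: date) -> list[tuple[date, date]]:
    """Split [start, end) into (first_day, last_day) spans that end on a Sunday."""
    weeks = []
    day = start
    while day < end:
        # date.weekday() is 0 for Monday and 6 for Sunday
        week_end = day + timedelta(days=6 - day.weekday())
        if week_end >= end:
            # Clip the final span so it never reaches the exclusive end date
            week_end = end - timedelta(days=1)
        weeks.append((day, week_end))
        day = week_end + timedelta(days=1)
    return weeks


# Same range as the script: 2025-09-01 (a Monday) up to, but excluding, 2026-05-11
weeks = split_into_weeks(date(2025, 9, 1), date(2026, 5, 11))
```

Because both boundary dates fall on a Monday, every span in this range is a full Monday-to-Sunday week; a start date in the middle of a week would simply produce a shorter first span.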
287
scripts/course_excel_v3.py
Normal file
@ -0,0 +1,287 @@
#!/usr/bin/env python3
import os
import psycopg2
from collections import defaultdict
from datetime import datetime, timedelta, date
import openpyxl
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
from openpyxl.chart import LineChart, BarChart, Reference
from openpyxl.utils import get_column_letter

conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591, user="ai_member",
    # The password is read from the environment instead of being committed;
    # VALA_BI_PASSWORD is an assumed variable name.
    password=os.environ["VALA_BI_PASSWORD"], dbname="vala_bi"
)
cur = conn.cursor()

# U0 prologue chapter_ids to exclude
u0_chapters = {55, 56, 57, 58, 59, 343, 344, 345, 346, 348}
overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)

# Split [overall_start, overall_end) into weeks that end on Sunday
weeks = []
d = overall_start
while d < overall_end:
    ws = d
    we = d + timedelta(days=6 - d.weekday())
    if we >= overall_end: we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)

print("分类付费用户...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END as level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()

cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())

user_levels = defaultdict(set)
user_orders = defaultdict(list)
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_levels[aid].add(lt)
    user_orders[aid].append((pay_date.date(), is_refunded))

def is_paid(aid, as_of):
    """True if the account has at least one non-refunded order paid on or before as_of."""
    return sum(1 for pd, ref in user_orders[aid] if pd <= as_of and not ref) > 0

# Overlapping pools: buyers of the L1+L2 bundle belong to both the L1 and the L2 pool
l1_pool = {aid for aid, lv in user_levels.items() if 'L1' in lv or 'L1+L2' in lv}
l2_pool = {aid for aid, lv in user_levels.items() if 'L2' in lv or 'L1+L2' in lv}
all_pool = l1_pool | l2_pool

print(f"L1池: {len(l1_pool)}, L2池: {len(l2_pool)}, 合计: {len(all_pool)}")

print("查询课消...")
cons_map = {}
for ti in range(8):
    tbl = f"bi_user_chapter_play_record_{ti}"
    cur.execute(f"""SELECT user_id, chapter_id, updated_at FROM {tbl}
        WHERE play_status = 1 AND updated_at >= '2025-09-01' AND updated_at < '2026-05-11'""")
    for uid, cid, ua in cur.fetchall():
        if cid in u0_chapters: continue
        key = (uid, cid)
        d = ua.date() if hasattr(ua, 'date') else datetime.strptime(str(ua)[:10], '%Y-%m-%d').date()
        # Keep only the earliest completion date per (user, chapter)
        if key not in cons_map or d < cons_map[key]:
            cons_map[key] = d

print("角色映射...")
all_uids = list(set(k[0] for k in cons_map))
char2acct = {}
for i in range(0, len(all_uids), 500):
    batch = all_uids[i:i+500]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall(): char2acct[cid] = aid

print("按周汇总...")
results = []
for ws, we in weeks:
    # Accounts in each pool that count as paying at the end of this week
    l1_paid = {aid for aid in l1_pool if is_paid(aid, we)}
    l2_paid = {aid for aid in l2_pool if is_paid(aid, we)}
    t_paid = {aid for aid in all_pool if is_paid(aid, we)}

    l1_cons, l1_cons_users = 0, set()
    l2_cons, l2_cons_users = 0, set()
    t_cons, t_cu = 0, set()

    for (uid, ch_id), cons_date in cons_map.items():
        if ws <= cons_date <= we:
            aid = char2acct.get(uid)
            if not aid: continue
            if aid in l1_paid:
                l1_cons += 1
                l1_cons_users.add(aid)
            if aid in l2_paid:
                l2_cons += 1
                l2_cons_users.add(aid)
            if aid in t_paid:
                t_cons += 1
                t_cu.add(aid)

    results.append({
        'ws': ws, 'we': we,
        'L1_paid': len(l1_paid), 'L1_cons': l1_cons, 'L1_cons_users': len(l1_cons_users),
        'L1_no_cons': len(l1_paid) - len(l1_cons_users),
        'L1_avg_all': round(l1_cons / len(l1_paid), 2) if l1_paid else 0,
        'L1_avg_cons': round(l1_cons / len(l1_cons_users), 2) if l1_cons_users else 0,
        'L2_paid': len(l2_paid), 'L2_cons': l2_cons, 'L2_cons_users': len(l2_cons_users),
        'L2_no_cons': len(l2_paid) - len(l2_cons_users),
        'L2_avg_all': round(l2_cons / len(l2_paid), 2) if l2_paid else 0,
        'L2_avg_cons': round(l2_cons / len(l2_cons_users), 2) if l2_cons_users else 0,
        'total_paid': len(t_paid), 'total_cons': t_cons, 'total_cons_users': len(t_cu),
        'total_no_cons': len(t_paid) - len(t_cu),
        'total_avg_all': round(t_cons / len(t_paid), 2) if t_paid else 0,
        'total_avg_cons': round(t_cons / len(t_cu), 2) if t_cu else 0,
    })

cur.close()
conn.close()

print("\n生成 Excel...")
wb = openpyxl.Workbook()
wb.remove(wb.active)

hfont = Font(name='微软雅黑', bold=True, size=9, color='FFFFFF')
hfill = PatternFill(start_color='2F5496', end_color='2F5496', fill_type='solid')
dfont = Font(name='微软雅黑', size=9)
tfont = Font(name='微软雅黑', bold=True, size=14, color='2F5496')
sfont = Font(name='微软雅黑', bold=True, size=11, color='2F5496')
bd = Border(left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin'))
ctr = Alignment(horizontal='center', vertical='center')

def ac(ws, r, c, v, font=dfont, fill=None, align=ctr):
    cl = ws.cell(row=r, column=c, value=v)
    cl.font, cl.border, cl.alignment = font, bd, align
    if fill: cl.fill = fill
    return cl

def ah(ws, r, c, v):
    cl = ws.cell(row=r, column=c, value=v)
    cl.font, cl.fill, cl.border, cl.alignment = hfont, hfill, bd, ctr
    return cl

# Sheet 1: overview
ws1 = wb.create_sheet("概览")
ws1.merge_cells('A1:H1')
ac(ws1, 1, 1, "付费用户课消分析(剔除U0序章)", font=tfont, fill=None, align=Alignment(horizontal='left'))

notes = [
    "口径:L1付费用户 = 买过L1商品(含L1+L2)的付费用户 | L2付费用户 = 买过L2商品(含L1+L2)的付费用户",
    "L1+L2用户同时出现在L1和L2两个视角中 | 合计为去重统计",
    "课消:用户首次完成某一课时(剔除U0序章,仅U1+)",
    "付费用户:status=1 + 未删除 + 有未退款订单",
]
for i, n in enumerate(notes):
    ws1.merge_cells(f'A{3+i}:H{3+i}')
    ac(ws1, 3+i, 1, n, font=Font(name='微软雅黑', size=9, color='666666'), fill=None, align=Alignment(horizontal='left'))

row = 9
ws1.merge_cells(f'A{row}:H{row}')
ac(ws1, row, 1, "汇总(截至最后一周)", font=sfont, fill=None, align=Alignment(horizontal='left'))
row += 1

for j, h in enumerate(['分类', '付费用户', '有课消', '无课消', '无课消率', '人均课消', '有消人均'], 1):
    ah(ws1, row, j, h)
row += 1

# Summary rows come from the last week in results
last = results[-1]
summary = [
    ('L1付费群', last['L1_paid'], last['L1_cons_users'], last['L1_no_cons'], last['L1_avg_all'], last['L1_avg_cons'], '#A8CFF1'),
    ('L2付费群', last['L2_paid'], last['L2_cons_users'], last['L2_no_cons'], last['L2_avg_all'], last['L2_avg_cons'], '#F4A9A0'),
    ('合计(去重)', last['total_paid'], last['total_cons_users'], last['total_no_cons'], last['total_avg_all'], last['total_avg_cons'], '#C8E6C9'),
]
for name, p, cu, nc, aa, ac_, clr in summary:
    no_rate = f"{nc/p*100:.0f}%" if p else "0%"
    fl = PatternFill(start_color='00'+clr[1:], end_color='00'+clr[1:], fill_type='solid')
    for j, v in enumerate([name, p, cu, nc, no_rate, aa, ac_], 1):
        f = Font(name='微软雅黑', bold=(j==1), size=10)
        ac(ws1, row, j, v, font=f, fill=fl)
    row += 1

# Sheet 2: weekly detail
ws2 = wb.create_sheet("每周明细")
headers = ['周', '周一起', '周日']
for prefix in ['合计', 'L1付费群', 'L2付费群']:
    for m in ['付费', '有消', '无消', '课消', '人均', '有消人均']:
        headers.append(f'{prefix}{m}')

for j, h in enumerate(headers, 1):
    ah(ws2, 1, j, h)

for ri, r in enumerate(results):
    rw = ri + 2
    ac(ws2, rw, 1, r['ws'].strftime('%m/%d'))
    ac(ws2, rw, 2, r['ws'].strftime('%Y-%m-%d'))
    ac(ws2, rw, 3, r['we'].strftime('%Y-%m-%d'))
    col = 4
    for prefix in ['total', 'L1', 'L2']:
        for k in ['paid', 'cons_users', 'no_cons', 'cons', 'avg_all', 'avg_cons']:
            ac(ws2, rw, col, r[f'{prefix}_{k}'])
            col += 1

for ci in range(1, len(headers)+1):
    ws2.column_dimensions[get_column_letter(ci)].width = 11 if ci <= 3 else 10
ws2.freeze_panes = 'D2'

# Sheet 3: L1 charts
ws_l1 = wb.create_sheet("L1图表")
lh = ['周', '付费用户', '有课消用户', '无课消用户', '课消总数', '人均课消', '有消人均']
first = next(i for i, r in enumerate(results) if r['L1_paid'] > 0)
l1d = results[first:]
for j, h in enumerate(lh, 1): ah(ws_l1, 1, j, h)
for ri, r in enumerate(l1d):
    rw = ri + 2
    ac(ws_l1, rw, 1, r['ws'].strftime('%m/%d'))
    for j, k in enumerate(['L1_paid','L1_cons_users','L1_no_cons','L1_cons','L1_avg_all','L1_avg_cons'], 2):
        ac(ws_l1, rw, j, r[k])

n = len(l1d)
cr = Reference(ws_l1, min_col=1, min_row=2, max_row=n+1)

# Stacked bar chart: users with / without consumption
ch1 = BarChart(); ch1.type = "col"; ch1.grouping = "stacked"
ch1.title = "L1付费用户周课消分布(剔除U0序章)"; ch1.style = 10; ch1.width = 24; ch1.height = 13
r1 = Reference(ws_l1, min_col=3, min_row=1, max_row=n+1); ch1.add_data(r1, titles_from_data=True)
r2 = Reference(ws_l1, min_col=4, min_row=1, max_row=n+1); ch1.add_data(r2, titles_from_data=True)
ch1.set_categories(cr)
ch1.series[0].graphicalProperties.solidFill = 'A8CFF1'
ch1.series[1].graphicalProperties.solidFill = 'D9D9D9'
ch1.y_axis.title = '用户数'; ch1.legend.position = 'b'
ws_l1.add_chart(ch1, "A9")

# Line chart: weekly averages
ch2 = LineChart(); ch2.title = "L1付费用户周人均课消趋势(剔除U0序章)"; ch2.style = 10; ch2.width = 24; ch2.height = 13
r3 = Reference(ws_l1, min_col=6, min_row=1, max_row=n+1); ch2.add_data(r3, titles_from_data=True)
r4 = Reference(ws_l1, min_col=7, min_row=1, max_row=n+1); ch2.add_data(r4, titles_from_data=True)
ch2.set_categories(cr)
ch2.series[0].graphicalProperties.line.solidFill = '999999'; ch2.series[0].graphicalProperties.line.width = 20000
ch2.series[1].graphicalProperties.line.solidFill = '4A90D9'; ch2.series[1].graphicalProperties.line.width = 28000
ch2.y_axis.scaling.min = 0; ch2.y_axis.title = '课消数(节/周)'; ch2.legend.position = 'b'
ws_l1.add_chart(ch2, "A27")
for ci in range(1, 8): ws_l1.column_dimensions[get_column_letter(ci)].width = 12

# Sheet 4: L2 charts
ws_l2 = wb.create_sheet("L2图表")
first2 = next(i for i, r in enumerate(results) if r['L2_paid'] > 0)
l2d = results[first2:]
for j, h in enumerate(lh, 1): ah(ws_l2, 1, j, h)
for ri, r in enumerate(l2d):
    rw = ri + 2
    ac(ws_l2, rw, 1, r['ws'].strftime('%m/%d'))
    for j, k in enumerate(['L2_paid','L2_cons_users','L2_no_cons','L2_cons','L2_avg_all','L2_avg_cons'], 2):
        ac(ws_l2, rw, j, r[k])

n2 = len(l2d)
cr2 = Reference(ws_l2, min_col=1, min_row=2, max_row=n2+1)

# Stacked bar chart: users with / without consumption
ch3 = BarChart(); ch3.type = "col"; ch3.grouping = "stacked"
ch3.title = "L2付费用户周课消分布(剔除U0序章)"; ch3.style = 10; ch3.width = 24; ch3.height = 13
r5 = Reference(ws_l2, min_col=3, min_row=1, max_row=n2+1); ch3.add_data(r5, titles_from_data=True)
r6 = Reference(ws_l2, min_col=4, min_row=1, max_row=n2+1); ch3.add_data(r6, titles_from_data=True)
ch3.set_categories(cr2)
ch3.series[0].graphicalProperties.solidFill = 'F4A9A0'
ch3.series[1].graphicalProperties.solidFill = 'D9D9D9'
ch3.y_axis.title = '用户数'; ch3.legend.position = 'b'
ws_l2.add_chart(ch3, "A9")

# Line chart: weekly averages
ch4 = LineChart(); ch4.title = "L2付费用户周人均课消趋势(剔除U0序章)"; ch4.style = 10; ch4.width = 24; ch4.height = 13
r7 = Reference(ws_l2, min_col=6, min_row=1, max_row=n2+1); ch4.add_data(r7, titles_from_data=True)
r8 = Reference(ws_l2, min_col=7, min_row=1, max_row=n2+1); ch4.add_data(r8, titles_from_data=True)
ch4.set_categories(cr2)
ch4.series[0].graphicalProperties.line.solidFill = '999999'; ch4.series[0].graphicalProperties.line.width = 20000
ch4.series[1].graphicalProperties.line.solidFill = 'E85D47'; ch4.series[1].graphicalProperties.line.width = 28000
ch4.y_axis.scaling.min = 0; ch4.y_axis.title = '课消数(节/周)'; ch4.legend.position = 'b'
ws_l2.add_chart(ch4, "A27")
for ci in range(1, 8): ws_l2.column_dimensions[get_column_letter(ci)].width = 12

path = '/root/.openclaw/workspace/output/course_consumption_by_level_v3.xlsx'
wb.save(path)
print(f"\n✅ {path}")
print(f"L1付费群: {last['L1_paid']}人 | L2付费群: {last['L2_paid']}人 | 合计(去重): {last['total_paid']}人")
print(f"L1无消率: {last['L1_no_cons']/last['L1_paid']*100:.0f}% | L2无消率: {last['L2_no_cons']/last['L2_paid']*100:.0f}%")
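The v2 and v3 scripts group paying accounts differently: v2 puts each account into exactly one category, while v3 builds overlapping L1 and L2 pools in which bundle buyers appear twice. The sketch below contrasts the two on a few made-up accounts. The function names `classify_exclusive` and `build_pools` and the account ids are hypothetical; the category labels and membership rules are copied from the scripts.

```python
def classify_exclusive(levels):
    """v2 style: map an account's purchased level types to a single category."""
    has_l1, has_l2 = 'L1' in levels, 'L2' in levels
    if 'L1+L2' in levels or (has_l1 and has_l2):
        return 'L1+L2'
    if has_l1:
        return '仅L1'
    if has_l2:
        return '仅L2'
    return '其他'


def build_pools(user_levels):
    """v3 style: overlapping pools; an L1+L2 bundle buyer joins both."""
    l1_pool = {aid for aid, lv in user_levels.items() if 'L1' in lv or 'L1+L2' in lv}
    l2_pool = {aid for aid, lv in user_levels.items() if 'L2' in lv or 'L1+L2' in lv}
    return l1_pool, l2_pool, l1_pool | l2_pool


# Hypothetical account ids mapped to the level types of their orders
sample_levels = {
    101: {'L1'},
    102: {'L2'},
    103: {'L1+L2'},     # bought the bundle
    104: {'L1', 'L2'},  # bought both levels separately
    105: {'其他'},       # legacy goods only
}

exclusive = {aid: classify_exclusive(lv) for aid, lv in sample_levels.items()}
l1_pool, l2_pool, all_pool = build_pools(sample_levels)
```

In this sample, accounts 103 and 104 fall into the single `L1+L2` category under the v2 rule but are counted in both pools under the v3 rule, so the L1 and L2 paid counts no longer sum to the de-duplicated total. Account 105 is in neither pool, whereas the v2 total includes the `其他` category.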
129
scripts/excel_v4.py
Normal file
@ -0,0 +1,129 @@
#!/usr/bin/env python3
"""Excel v4: the L1 view counts only L1 courses, the L2 view counts only L2 courses"""
import json, openpyxl
from datetime import date
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
from openpyxl.chart import LineChart, BarChart, Reference
from openpyxl.utils import get_column_letter

with open('/root/.openclaw/workspace/output/course_data_v4.json') as f:
    raw = json.load(f)
results = raw['results']

# Week boundaries are stored as ISO strings in the JSON; convert them back to dates
for r in results:
    r['ws'] = date.fromisoformat(r['ws'])
    r['we'] = date.fromisoformat(r['we'])

wb = openpyxl.Workbook()
wb.remove(wb.active)
hfont = Font(name='微软雅黑', bold=True, size=9, color='FFFFFF')
hfill = PatternFill(start_color='002F5496', end_color='002F5496', fill_type='solid')
dfont = Font(name='微软雅黑', size=9)
tfont = Font(name='微软雅黑', bold=True, size=14, color='002F5496')
sfont = Font(name='微软雅黑', bold=True, size=11, color='002F5496')
bd = Border(left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin'))
ctr = Alignment(horizontal='center', vertical='center')

def ac(ws, r, c, v, font=dfont, fill=None, align=ctr):
    cl = ws.cell(row=r, column=c, value=v)
    cl.font, cl.border, cl.alignment = font, bd, align
    if fill: cl.fill = fill

def ah(ws, r, c, v):
    cl = ws.cell(row=r, column=c, value=v)
    cl.font, cl.fill, cl.border, cl.alignment = hfont, hfill, bd, ctr

# Sheet 1: overview
ws1 = wb.create_sheet("概览")
ws1.merge_cells('A1:H1')
ac(ws1,1,1,"付费用户课消分析 v4(只看对应级别课程,剔除U0)",font=tfont,align=Alignment(horizontal='left'))
notes = [
    "口径:L1付费群 = 买过L1商品的付费用户, 只看L1课程课消 | L2付费群 = 买过L2商品的付费用户, 只看L2课程课消",
    "L1+L2用户:在L1视角只统计L1课程课消, L2视角只统计L2课程课消",
    "课消:用户首次完成某一课时(剔除U0序章)",
    "付费用户:status=1 + 未删除 + 有未退款订单",
]
for i,n in enumerate(notes):
    ws1.merge_cells(f'A{3+i}:H{3+i}')
    ac(ws1,3+i,1,n,font=Font(name='微软雅黑',size=9,color='666666'),align=Alignment(horizontal='left'))

row=9
ws1.merge_cells(f'A{row}:H{row}')
ac(ws1,row,1,"汇总(截至最后一周)",font=sfont,align=Alignment(horizontal='left'))
row+=1
for j,h in enumerate(['分类','付费用户','有课消','无课消','无课消率','人均课消','有消人均'],1):
    ah(ws1,row,j,h)
row+=1

# Summary rows come from the last week in results
last=results[-1]
skus = [
    ('L1付费群(只看L1课程)', last['L1_paid'],last['L1_cons_users'],last['L1_no_cons'],last['L1_avg_all'],last['L1_avg_cons'], '00A8CFF1'),
    ('L2付费群(只看L2课程)', last['L2_paid'],last['L2_cons_users'],last['L2_no_cons'],last['L2_avg_all'],last['L2_avg_cons'], '00F4A9A0'),
    ('合计(去重)', last['total_paid'],last['total_cons_users'],last['total_no_cons'],last['total_avg_all'],last['total_avg_cons'], '00C8E6C9'),
]
for name,p,cu,nc,aa,ac_,clr in skus:
    no_rate=f"{nc/p*100:.0f}%" if p else "0%"
    fl=PatternFill(start_color=clr,end_color=clr,fill_type='solid')
    for j,v in enumerate([name,p,cu,nc,no_rate,aa,ac_],1):
        ac(ws1,row,j,v,font=Font(name='微软雅黑',bold=(j==1),size=10),fill=fl)
    row+=1

# Sheet 2: weekly detail
ws2=wb.create_sheet("每周明细")
headers=['周','周一起','周日']
for pfx in ['合计','L1付费群','L2付费群']:
    for m in ['付费','有消','无消','课消','人均','有消人均']:
        headers.append(f'{pfx}{m}')
for j,h in enumerate(headers,1): ah(ws2,1,j,h)
for ri,r in enumerate(results):
    rw=ri+2
    ac(ws2,rw,1,r['ws'].strftime('%m/%d'))
    ac(ws2,rw,2,r['ws'].strftime('%Y-%m-%d'))
    ac(ws2,rw,3,r['we'].strftime('%Y-%m-%d'))
    col=4
    for prefix in ['total','L1','L2']:
        for k in ['paid','cons_users','no_cons','cons','avg_all','avg_cons']:
            ac(ws2,rw,col,r[f'{prefix}_{k}'])
            col+=1
for ci in range(1,len(headers)+1):
    ws2.column_dimensions[get_column_letter(ci)].width=11 if ci<=3 else 10
ws2.freeze_panes='D2'

||||
# Sheets 3 and 4: one chart sheet per level
for lvl, pf, clr in [('L1', 'L1', '4A90D9'), ('L2', 'L2', 'E85D47')]:
    ws = wb.create_sheet(f"{pf}图表")
    lh = ['周', '付费用户', '有课消用户', '无课消用户', '课消总数', '人均课消', '有消人均']
    # Skip leading weeks in which this group has no paying users yet
    first = next(i for i, r in enumerate(results) if r[f'{pf}_paid'] > 0)
    ld = results[first:]
    for j, h in enumerate(lh, 1):
        ah(ws, 1, j, h)
    for ri, r in enumerate(ld):
        rw = ri + 2
        ac(ws, rw, 1, r['ws'].strftime('%m/%d'))
        for j, k in enumerate([f'{pf}_paid', f'{pf}_cons_users', f'{pf}_no_cons', f'{pf}_cons', f'{pf}_avg_all', f'{pf}_avg_cons'], 2):
            ac(ws, rw, j, r[k])
    n = len(ld)
    cr = Reference(ws, min_col=1, min_row=2, max_row=n+1)

    # Stacked column chart: users with vs. without consumption
    ch1 = BarChart()
    ch1.type = "col"
    ch1.grouping = "stacked"
    ch1.overlap = 100  # needed so the stacked series sit on top of each other
    ch1.title = f"{pf}付费用户周课消分布(只看{pf}课程)"
    ch1.style = 10
    ch1.width = 24
    ch1.height = 13
    ch1.add_data(Reference(ws, min_col=3, min_row=1, max_row=n+1), titles_from_data=True)
    ch1.add_data(Reference(ws, min_col=4, min_row=1, max_row=n+1), titles_from_data=True)
    ch1.set_categories(cr)
    ch1.series[0].graphicalProperties.solidFill = 'A8CFF1' if pf == 'L1' else 'F4A9A0'
    ch1.series[1].graphicalProperties.solidFill = 'D9D9D9'
    ch1.y_axis.title = '用户数'
    ch1.legend.position = 'b'
    ws.add_chart(ch1, "A9")

    # Line chart: weekly average consumption (all paying users vs. active users)
    ch2 = LineChart()
    ch2.title = f"{pf}付费用户周人均课消趋势(只看{pf}课程)"
    ch2.style = 10
    ch2.width = 24
    ch2.height = 13
    ch2.add_data(Reference(ws, min_col=6, min_row=1, max_row=n+1), titles_from_data=True)
    ch2.add_data(Reference(ws, min_col=7, min_row=1, max_row=n+1), titles_from_data=True)
    ch2.set_categories(cr)
    ch2.series[0].graphicalProperties.line.solidFill = '999999'
    ch2.series[0].graphicalProperties.line.width = 20000
    ch2.series[1].graphicalProperties.line.solidFill = clr
    ch2.series[1].graphicalProperties.line.width = 28000
    ch2.y_axis.scaling.min = 0
    ch2.y_axis.title = '课消数(节/周)'
    ch2.legend.position = 'b'
    ws.add_chart(ch2, "A27")
    for ci in range(1, 8):
        ws.column_dimensions[get_column_letter(ci)].width = 12

path = '/root/.openclaw/workspace/output/course_consumption_by_level_v4.xlsx'
wb.save(path)
print(f'✅ {path}')
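
The summary columns written by the script above (no-consumption count and rate, average per paying user, average per active user) all derive from three raw counts. A minimal sketch of that arithmetic follows; the function name `weekly_metrics` and the returned key names are illustrative assumptions, not names taken from the script.

```python
def weekly_metrics(n_paid: int, n_cons: int, n_cons_users: int) -> dict:
    """Derive the per-week ratios from three raw counts.

    n_paid       -- paying users as of the end of the week
    n_cons       -- first-time lesson completions during the week
    n_cons_users -- paying users with at least one completion that week
    (Hypothetical helper; the key names below are assumptions.)
    """
    no_cons = n_paid - n_cons_users
    return {
        "no_cons": no_cons,
        # Guard every division: early weeks can have zero paying users.
        "no_cons_rate": no_cons / n_paid if n_paid else 0.0,
        "avg_all": round(n_cons / n_paid, 2) if n_paid else 0,
        "avg_cons": round(n_cons / n_cons_users, 2) if n_cons_users else 0,
    }


if __name__ == "__main__":
    print(weekly_metrics(n_paid=200, n_cons=150, n_cons_users=50))
```

Keeping the zero guards in one place avoids repeating the `if n_paid else 0` pattern for every group prefix.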
|
||||
247
scripts/generate_charts.py
Normal file
@ -0,0 +1,247 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
Generate four lesson-consumption charts (U0 prologue chapters excluded):
1. L1 paying users: weekly consumption distribution (stacked bar chart)
2. L2 paying users: weekly consumption distribution (stacked bar chart)
3. L1 weekly average consumption per user (line chart)
4. L2 weekly average consumption per user (line chart)
"""
import psycopg2
from collections import defaultdict
from datetime import datetime, timedelta, date
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import numpy as np

# Register a CJK font so that the Chinese chart labels render
import matplotlib.font_manager as fm
font_path = '/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc'
fm.fontManager.addfont(font_path)
prop = fm.FontProperties(fname=font_path)
font_name = prop.get_name()
plt.rcParams['font.family'] = font_name
plt.rcParams['axes.unicode_minus'] = False
print(f'Using font: {font_name}')
|
||||
|
||||
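import os

conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591, user="ai_member",
    # The password is read from the environment instead of being committed.
    # VALA_BI_DB_PASSWORD is an assumed variable name, not an existing convention.
    password=os.environ["VALA_BI_DB_PASSWORD"], dbname="vala_bi"
)
cur = conn.cursor()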
|
||||
|
||||
# ===== Configuration =====
u0_chapters = {55, 56, 57, 58, 59, 343, 344, 345, 346, 348}
overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)  # exclusive

# Split [overall_start, overall_end) into weeks that end on Sunday
weeks = []
d = overall_start
while d < overall_end:
    ws = d
    we = d + timedelta(days=6 - d.weekday())
    if we >= overall_end:
        we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)
|
||||
|
||||
# ===== User classification =====
print("Classifying paying users...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END AS level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()

# Trade numbers that have a refund record with status = 3
cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())
|
||||
|
||||
user_data = defaultdict(lambda: {'levels': set(), 'orders': []})
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_data[aid]['levels'].add(lt)
    user_data[aid]['orders'].append((pay_date.date(), is_refunded, lt))


def classify(levels):
    """Map the set of level types a user has bought to one category."""
    h1, h2 = 'L1' in levels, 'L2' in levels
    if 'L1+L2' in levels or (h1 and h2):
        return 'L1+L2'
    if h1:
        return '仅L1'
    if h2:
        return '仅L2'
    return '其他'


for aid in user_data:
    user_data[aid]['category'] = classify(user_data[aid]['levels'])


def is_paid(aid, as_of):
    """True if the user has at least one non-refunded order paid on or before as_of."""
    return any(pd <= as_of and not ref for pd, ref, lt in user_data[aid]['orders'])
|
||||
|
||||
# ===== Lesson consumption =====
# A consumption is the first time a character completes a chapter, so keep
# only the earliest completion date per (user_id, chapter_id).
print("Querying lesson completions...")
cons_map = {}
for table_idx in range(8):
    tbl = f"bi_user_chapter_play_record_{table_idx}"
    # The table name is interpolated; the date bounds are passed as parameters
    cur.execute(f"""
        SELECT user_id, chapter_id, updated_at
        FROM {tbl}
        WHERE play_status = 1 AND updated_at >= %s AND updated_at < %s
    """, (overall_start, overall_end))
    for uid, cid, ua in cur.fetchall():
        if cid in u0_chapters:
            continue
        key = (uid, cid)
        d = ua.date() if hasattr(ua, 'date') else datetime.strptime(str(ua)[:10], '%Y-%m-%d').date()
        if key not in cons_map or d < cons_map[key]:
            cons_map[key] = d
|
||||
|
||||
# Map character ids to account ids, in batches of 500
print("Mapping characters to accounts...")
all_uids = list(set(k[0] for k in cons_map))
char2acct = {}
bs = 500
for i in range(0, len(all_uids), bs):
    batch = all_uids[i:i+bs]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall():
        char2acct[cid] = aid
|
||||
|
||||
# ===== Weekly aggregation =====
print("Aggregating by week...")
results = []
for ws, we in weeks:
    # Paying users as of the end of the week, grouped by category
    paid_by_cat = defaultdict(set)
    for aid in user_data:
        if is_paid(aid, we):
            paid_by_cat[user_data[aid]['category']].add(aid)

    cons_by_cat = defaultdict(int)
    cons_users_by_cat = defaultdict(set)

    for (uid, ch_id), cons_date in cons_map.items():
        if ws <= cons_date <= we:
            aid = char2acct.get(uid)
            if aid:
                cat = user_data.get(aid, {}).get('category', '其他')
                if aid in paid_by_cat.get(cat, set()):
                    cons_by_cat[cat] += 1
                    cons_users_by_cat[cat].add(aid)

    row = {'ws': ws, 'we': we}
    for cat in ['仅L1', '仅L2', 'L1+L2']:
        n_paid = len(paid_by_cat.get(cat, set()))
        n_cons = cons_by_cat.get(cat, 0)
        n_cons_users = len(cons_users_by_cat.get(cat, set()))
        row[f'{cat}_paid'] = n_paid
        row[f'{cat}_cons'] = n_cons
        row[f'{cat}_cons_users'] = n_cons_users
        row[f'{cat}_no_cons'] = n_paid - n_cons_users
        row[f'{cat}_avg_all'] = round(n_cons / n_paid, 2) if n_paid > 0 else 0
        row[f'{cat}_avg_cons'] = round(n_cons / n_cons_users, 2) if n_cons_users > 0 else 0
    results.append(row)

cur.close()
conn.close()
|
||||
|
||||
# ===== Chart generation =====
print("\nGenerating charts...")
output_dir = '/root/.openclaw/workspace/output'

configs = {
    'L1': {'cat': '仅L1', 'color': '#4A90D9', 'light': '#A8CFF1', 'label': 'L1'},
    'L2': {'cat': '仅L2', 'color': '#E85D47', 'light': '#F4A9A0', 'label': 'L2'},
}

for key, cfg in configs.items():
    cat = cfg['cat']
    color = cfg['color']
    light = cfg['light']
    label = cfg['label']

    # Skip leading weeks in which the category has no paying users
    first = next(i for i, r in enumerate(results) if r[f'{cat}_paid'] > 0)
    data = results[first:]

    xs = [r['ws'] + timedelta(days=3) for r in data]
    labels = [r['ws'].strftime('%m/%d') for r in data]
    paid = [r[f'{cat}_paid'] for r in data]
    cons_users = [r[f'{cat}_cons_users'] for r in data]
    no_cons = [r[f'{cat}_no_cons'] for r in data]
    avg_all = [r[f'{cat}_avg_all'] for r in data]
    avg_cons = [r[f'{cat}_avg_cons'] for r in data]
|
||||
|
||||
    # --- Chart 1: stacked bar chart ---
    fig, ax = plt.subplots(figsize=(18, 8))

    x_idx = np.arange(len(xs))
    bar_w = 0.65

    p1 = ax.bar(x_idx, cons_users, bar_w, color=light, label='有课消用户', zorder=3)
    p2 = ax.bar(x_idx, no_cons, bar_w, bottom=cons_users, color='#D0D0D0', label='无课消用户', zorder=3)

    # Annotate the total number of paying users on every step-th bar
    for i, (p, c, n) in enumerate(zip(paid, cons_users, no_cons)):
        if i % max(1, len(data)//12) == 0:
            ax.annotate(str(p), (i, p), textcoords='offset points', xytext=(0, 6),
                        fontsize=8, ha='center', color='#333333', fontweight='bold')

    ax.set_xticks(x_idx[::max(1, len(data)//12)])
    ax.set_xticklabels([labels[i] for i in range(0, len(data), max(1, len(data)//12))], fontsize=9, rotation=45)

    ax.set_ylabel('用户数', fontsize=13)
    ax.set_title(f'{label} 付费用户周课消分布(剔除U0序章)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(axis='y', alpha=0.3, zorder=0)
    ax.set_xlim(-0.5, len(x_idx) - 0.5)

    # No-consumption rate for the latest week
    no_rate = no_cons[-1] / paid[-1] * 100 if paid[-1] else 0
    ax.text(0.97, 0.95, f'无课消率: {no_rate:.0f}%', transform=ax.transAxes,
            fontsize=11, ha='right', va='top', color='#999999', fontstyle='italic')

    plt.tight_layout()
    path1 = f'{output_dir}/{key}_users_stack.png'
    plt.savefig(path1, dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f' ✅ {path1}')
|
||||
|
||||
    # --- Chart 2: line chart ---
    fig, ax = plt.subplots(figsize=(18, 8))

    # Dashed line: the format string carries the line style, so no separate
    # linestyle keyword is passed (passing both triggers a matplotlib warning).
    ax.plot(xs, avg_all, 'o--', color='#999999', linewidth=2.2, markersize=5,
            label='周人均课消(全部付费用户)', markerfacecolor='white')
    ax.plot(xs, avg_cons, 's-', color=color, linewidth=2.8, markersize=5,
            label='周有消人均课消', markerfacecolor='white')

    # Shade the gap between the two averages
    ax.fill_between(xs, avg_all, avg_cons, alpha=0.08, color=color)

    # Annotate selected data points
    for i in range(len(xs)):
        if i % max(1, len(data)//8) == 0:
            ax.annotate(f'{avg_all[i]:.1f}', (xs[i], avg_all[i]), textcoords='offset points',
                        xytext=(0, -16), fontsize=7.5, color='#999999', ha='center')
            ax.annotate(f'{avg_cons[i]:.1f}', (xs[i], avg_cons[i]), textcoords='offset points',
                        xytext=(0, 8), fontsize=7.5, color=color, ha='center', fontweight='bold')

    ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    plt.setp(ax.xaxis.get_majorticklabels(), rotation=45, fontsize=9)

    ax.set_ylabel('课消数(节/周)', fontsize=13)
    ax.set_title(f'{label} 周人均课消趋势(剔除U0序章)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(True, alpha=0.3)
    ax.set_xlim(date(2025, 8, 30), date(2026, 5, 12))

    plt.tight_layout()
    path2 = f'{output_dir}/{key}_avg_trend.png'
    plt.savefig(path2, dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f' ✅ {path2}')

print('\nAll 4 charts generated!')
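
The classification rule used by the script above (bucket each order's goods_id, then merge the buckets per account) can be checked in isolation. The sketch below assumes the goods_id sets from the SQL CASE expression; `classify_account` and the constant names are hypothetical and do not appear in the script.

```python
# goods_id sets copied from the CASE expression in the order query
L1_GOODS = {57, 60, 63}
L2_GOODS = {31, 32, 33, 54}
BUNDLE_GOODS = {61}  # the combined L1+L2 product


def classify_account(goods_ids):
    """Return the category for one account given all goods_ids it has bought.

    Hypothetical helper: it folds the SQL bucketing and the Python
    classify() step into a single function for illustration.
    """
    ids = set(goods_ids)
    has_l1 = bool(ids & L1_GOODS)
    has_l2 = bool(ids & L2_GOODS)
    if ids & BUNDLE_GOODS or (has_l1 and has_l2):
        return 'L1+L2'
    if has_l1:
        return '仅L1'
    if has_l2:
        return '仅L2'
    return '其他'


if __name__ == "__main__":
    for sample in ([57], [31, 60], [61], [4]):
        print(sample, '->', classify_account(sample))
```

Note that an account holding only legacy products falls into '其他', which the chart loops above never plot.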
|
||||
218
scripts/generate_charts_v3.py
Normal file
@ -0,0 +1,218 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
Charts v2: L1 paying users = L1-only + L1+L2; L2 paying users = L2-only + L1+L2
"""
import psycopg2
from collections import defaultdict
from datetime import datetime, timedelta, date
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.font_manager as fm
import numpy as np

# Register a CJK font so that the Chinese chart labels render
fm.fontManager.addfont('/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc')
plt.rcParams['font.family'] = fm.FontProperties(fname='/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc').get_name()
plt.rcParams['axes.unicode_minus'] = False
|
||||
|
||||
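import os

conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591, user="ai_member",
    # The password is read from the environment instead of being committed.
    # VALA_BI_DB_PASSWORD is an assumed variable name, not an existing convention.
    password=os.environ["VALA_BI_DB_PASSWORD"], dbname="vala_bi"
)
cur = conn.cursor()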
|
||||
|
||||
u0_chapters = {55, 56, 57, 58, 59, 343, 344, 345, 346, 348}
overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)  # exclusive

# Split [overall_start, overall_end) into weeks that end on Sunday
weeks = []
d = overall_start
while d < overall_end:
    ws = d
    we = d + timedelta(days=6 - d.weekday())
    if we >= overall_end:
        we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)
|
||||
|
||||
print("Classifying paying users...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END AS level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()

# Trade numbers that have a refund record with status = 3
cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())
|
||||
|
||||
user_levels = defaultdict(set)
user_orders = defaultdict(list)
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_levels[aid].add(lt)
    user_orders[aid].append((pay_date.date(), is_refunded))


def is_paid(aid, as_of):
    """True if the user has at least one non-refunded order paid on or before as_of."""
    return any(pd <= as_of and not ref for pd, ref in user_orders[aid])


# Grouping: L1 group = L1-only + L1+L2; L2 group = L2-only + L1+L2
l1_group = set()  # every user who bought L1
l2_group = set()  # every user who bought L2
for aid, levels in user_levels.items():
    has_l1 = 'L1' in levels or 'L1+L2' in levels
    has_l2 = 'L2' in levels or 'L1+L2' in levels
    if has_l1:
        l1_group.add(aid)
    if has_l2:
        l2_group.add(aid)

print(f"L1 group: {len(l1_group)} users, L2 group: {len(l2_group)} users, overlap (L1+L2): {len(l1_group & l2_group)} users")
|
||||
|
||||
# Keep only the earliest completion date per (user_id, chapter_id)
print("Querying lesson completions...")
cons_map = {}
for ti in range(8):
    tbl = f"bi_user_chapter_play_record_{ti}"
    # The table name is interpolated; the date bounds are passed as parameters
    cur.execute(f"""SELECT user_id, chapter_id, updated_at FROM {tbl}
        WHERE play_status = 1 AND updated_at >= %s AND updated_at < %s""",
                (overall_start, overall_end))
    for uid, cid, ua in cur.fetchall():
        if cid in u0_chapters:
            continue
        key = (uid, cid)
        d = ua.date() if hasattr(ua, 'date') else datetime.strptime(str(ua)[:10], '%Y-%m-%d').date()
        if key not in cons_map or d < cons_map[key]:
            cons_map[key] = d
|
||||
|
||||
# Map character ids to account ids, in batches of 500
print("Mapping characters to accounts...")
all_uids = list(set(k[0] for k in cons_map))
char2acct = {}
for i in range(0, len(all_uids), 500):
    batch = all_uids[i:i+500]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall():
        char2acct[cid] = aid
|
||||
|
||||
print("Aggregating by week...")
results = []
for ws, we in weeks:
    # Paying users as of the end of the week
    l1_paid = {aid for aid in l1_group if is_paid(aid, we)}
    l2_paid = {aid for aid in l2_group if is_paid(aid, we)}

    l1_cons, l1_cons_users = 0, set()
    l2_cons, l2_cons_users = 0, set()

    for (uid, ch_id), cons_date in cons_map.items():
        if ws <= cons_date <= we:
            aid = char2acct.get(uid)
            if not aid:
                continue
            # An L1+L2 user is counted in both groups
            if aid in l1_paid:
                l1_cons += 1
                l1_cons_users.add(aid)
            if aid in l2_paid:
                l2_cons += 1
                l2_cons_users.add(aid)

    results.append({
        'ws': ws, 'we': we,
        'L1_paid': len(l1_paid), 'L1_cons': l1_cons, 'L1_cons_users': len(l1_cons_users),
        'L1_no_cons': len(l1_paid) - len(l1_cons_users),
        'L1_avg_all': round(l1_cons / len(l1_paid), 2) if l1_paid else 0,
        'L1_avg_cons': round(l1_cons / len(l1_cons_users), 2) if l1_cons_users else 0,
        'L2_paid': len(l2_paid), 'L2_cons': l2_cons, 'L2_cons_users': len(l2_cons_users),
        'L2_no_cons': len(l2_paid) - len(l2_cons_users),
        'L2_avg_all': round(l2_cons / len(l2_paid), 2) if l2_paid else 0,
        'L2_avg_cons': round(l2_cons / len(l2_cons_users), 2) if l2_cons_users else 0,
    })

cur.close()
conn.close()
|
||||
|
||||
# ===== Chart generation =====
print("\nGenerating charts...")
out = '/root/.openclaw/workspace/output'

configs = {
    'L1_all': {'prefix': 'L1', 'color': '#4A90D9', 'light': '#A8CFF1', 'label': 'L1'},
    'L2_all': {'prefix': 'L2', 'color': '#E85D47', 'light': '#F4A9A0', 'label': 'L2'},
}

for key, cfg in configs.items():
    pfx = cfg['prefix']
    color = cfg['color']
    light = cfg['light']
    label = cfg['label']

    # Skip leading weeks in which the group has no paying users
    first = next(i for i, r in enumerate(results) if r[f'{pfx}_paid'] > 0)
    data = results[first:]

    xs = [r['ws'] + timedelta(days=3) for r in data]
    dates = [r['ws'] for r in data]
    paid = [r[f'{pfx}_paid'] for r in data]
    cons_users = [r[f'{pfx}_cons_users'] for r in data]
    no_cons = [r[f'{pfx}_no_cons'] for r in data]
    avg_all = [r[f'{pfx}_avg_all'] for r in data]
    avg_cons = [r[f'{pfx}_avg_cons'] for r in data]
|
||||
|
||||
    # Chart 1: stacked bar chart
    fig, ax = plt.subplots(figsize=(18, 8))
    x_idx = np.arange(len(xs))
    bar_w = 0.65
    ax.bar(x_idx, cons_users, bar_w, color=light, label='有课消用户', zorder=3)
    ax.bar(x_idx, no_cons, bar_w, bottom=cons_users, color='#D0D0D0', label='无课消用户', zorder=3)

    # Annotate the total number of paying users on every step-th bar
    step = max(1, len(data)//10)
    for i in range(0, len(data), step):
        ax.annotate(str(paid[i]), (i, paid[i]), textcoords='offset points', xytext=(0, 5),
                    fontsize=7.5, ha='center', color='#333333', fontweight='bold')

    ax.set_xticks(x_idx[::step])
    ax.set_xticklabels([dates[i].strftime('%m/%d') for i in range(0, len(data), step)], fontsize=8.5, rotation=45)
    ax.set_ylabel('用户数', fontsize=13)
    ax.set_title(f'{label}付费用户周课消分布(剔除U0序章)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(axis='y', alpha=0.3, zorder=0)
    ax.set_xlim(-0.5, len(x_idx) - 0.5)

    # Paying users and no-consumption rate for the latest week
    no_rate = no_cons[-1] / paid[-1] * 100 if paid[-1] else 0
    ax.text(0.97, 0.95, f'付费{paid[-1]}人 | 无课消率{no_rate:.0f}%', transform=ax.transAxes,
            fontsize=11, ha='right', va='top', color='#666666', fontstyle='italic')

    plt.tight_layout()
    plt.savefig(f'{out}/{key}_users_stack.png', dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f' ✅ {key}_users_stack.png')
|
||||
|
||||
    # Chart 2: line chart
    fig, ax = plt.subplots(figsize=(18, 8))

    ax.plot(xs, avg_all, 'o-', color='#999999', linewidth=2.2, markersize=5,
            label='人均课消(全部付费用户)', markerfacecolor='white')
    ax.plot(xs, avg_cons, 's-', color=color, linewidth=2.8, markersize=5,
            label='人均课消(有课消用户)', markerfacecolor='white')
    ax.fill_between(xs, avg_all, avg_cons, alpha=0.08, color=color)

    # Annotate selected data points
    for i in range(0, len(data), max(1, len(data)//8)):
        ax.annotate(f'{avg_all[i]:.1f}', (xs[i], avg_all[i]), textcoords='offset points',
                    xytext=(0, -15), fontsize=7.5, color='#999999', ha='center')
        ax.annotate(f'{avg_cons[i]:.1f}', (xs[i], avg_cons[i]), textcoords='offset points',
                    xytext=(0, 7), fontsize=7.5, color=color, ha='center', fontweight='bold')

    ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    plt.setp(ax.xaxis.get_majorticklabels(), rotation=45, fontsize=9)
    ax.set_ylabel('课消数(节/周)', fontsize=13)
    ax.set_title(f'{label}付费用户周人均课消趋势(剔除U0序章)', fontsize=16, fontweight='bold')
    ax.legend(fontsize=12, loc='upper left')
    ax.grid(True, alpha=0.3)
    ax.set_xlim(date(2025, 8, 30), date(2026, 5, 12))

    plt.tight_layout()
    plt.savefig(f'{out}/{key}_avg_trend.png', dpi=150, bbox_inches='tight', facecolor='white')
    plt.close()
    print(f' ✅ {key}_avg_trend.png')

print('\n✅ 4 charts generated')
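
Two steps in the script above are easy to get subtly wrong: splitting the date range into Sunday-terminated weeks, and keeping only the first completion per (user, chapter). The sketch below restates both as pure functions so they can be tested without a database. The names `build_weeks` and `first_completions` are hypothetical, and records are assumed to be `(user_id, chapter_id, date)` tuples.

```python
from datetime import date, timedelta


def build_weeks(start, end):
    """Split [start, end) into (first_day, last_day) pairs, each ending on
    Sunday, except that the final week is clipped to end - 1 day."""
    weeks, d = [], start
    while d < end:
        last = min(d + timedelta(days=6 - d.weekday()), end - timedelta(days=1))
        weeks.append((d, last))
        d = last + timedelta(days=1)
    return weeks


def first_completions(records, excluded_chapters=frozenset()):
    """Return {(user_id, chapter_id): earliest completion date}.

    Chapters listed in excluded_chapters (for example the U0 prologue)
    are skipped entirely.
    """
    first = {}
    for user_id, chapter_id, day in records:
        if chapter_id in excluded_chapters:
            continue
        key = (user_id, chapter_id)
        if key not in first or day < first[key]:
            first[key] = day
    return first


if __name__ == "__main__":
    print(build_weeks(date(2025, 9, 3), date(2025, 9, 10)))
    sample = [(1, 100, date(2025, 9, 5)), (1, 100, date(2025, 9, 2)), (1, 55, date(2025, 9, 1))]
    print(first_completions(sample, excluded_chapters={55}))
```

Because repeat plays collapse to one entry per key, counting dictionary entries that fall inside a week gives that week's consumption without double counting.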
|
||||
385
scripts/generate_excel.py
Normal file
@ -0,0 +1,385 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
Generate the lesson-consumption metrics workbook: weekly figures split by L1/L2
"""
import psycopg2
from collections import defaultdict
from datetime import datetime, timedelta, date
import openpyxl
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
from openpyxl.chart import LineChart, Reference
from openpyxl.utils import get_column_letter
from openpyxl.chart.label import DataLabelList
from openpyxl.chart.series import DataPoint
|
||||
|
||||
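import os

conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591, user="ai_member",
    # The password is read from the environment instead of being committed.
    # VALA_BI_DB_PASSWORD is an assumed variable name, not an existing convention.
    password=os.environ["VALA_BI_DB_PASSWORD"], dbname="vala_bi"
)
cur = conn.cursor()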
|
||||
|
||||
# ===== Time range =====
overall_start = date(2025, 9, 1)
overall_end = date(2026, 5, 11)  # exclusive

# Split [overall_start, overall_end) into weeks that end on Sunday
weeks = []
d = overall_start
while d < overall_end:
    ws = d
    days_to_sunday = 6 - d.weekday()
    we = d + timedelta(days=days_to_sunday)
    if we >= overall_end:
        we = overall_end - timedelta(days=1)
    weeks.append((ws, we))
    d = we + timedelta(days=1)
|
||||
|
||||
# ===== Step 1: user classification =====
print("Step 1: classifying paying users...")
cur.execute("""
    SELECT o.account_id, o.trade_no, o.order_status, o.pay_success_date,
           CASE WHEN o.goods_id IN (57, 60, 63) THEN 'L1'
                WHEN o.goods_id = 61 THEN 'L1+L2'
                WHEN o.goods_id IN (31, 32, 33, 54) THEN 'L2'
                ELSE '其他' END AS level_type
    FROM bi_vala_order o
    INNER JOIN bi_vala_app_account a ON o.account_id = a.id
    WHERE a.status = 1 AND a.deleted_at IS NULL AND o.pay_success_date IS NOT NULL
""")
orders = cur.fetchall()
print(f"  orders: {len(orders)}")

# Trade numbers that have a refund record with status = 3
cur.execute("SELECT trade_no FROM bi_refund_order WHERE status = 3")
refund_trades = set(r[0] for r in cur.fetchall())
|
||||
|
||||
user_data = defaultdict(lambda: {'levels': set(), 'orders': []})
for aid, trade_no, order_status, pay_date, lt in orders:
    is_refunded = (order_status == 4 and trade_no in refund_trades)
    user_data[aid]['levels'].add(lt)
    user_data[aid]['orders'].append((pay_date.date(), is_refunded, lt))


def classify_user(levels):
    """Map the set of level types a user has bought to one category."""
    has_l1, has_l2 = 'L1' in levels, 'L2' in levels
    if 'L1+L2' in levels or (has_l1 and has_l2):
        return 'L1+L2'
    if has_l1:
        return '仅L1'
    if has_l2:
        return '仅L2'
    return '其他'


for aid in user_data:
    user_data[aid]['category'] = classify_user(user_data[aid]['levels'])


def is_paid_as_of(aid, as_of_date):
    """True if the user has at least one non-refunded order paid on or before as_of_date."""
    return any(pd <= as_of_date and not ref for pd, ref, lt in user_data[aid]['orders'])
|
||||
|
||||
# ===== Step 2: lesson consumption =====
# Keep only the earliest completion date per (user_id, chapter_id)
print("Step 2: querying lesson completions...")
consumption_map = {}
for table_idx in range(8):
    tbl = f"bi_user_chapter_play_record_{table_idx}"
    # The table name is interpolated; the date bounds are passed as parameters
    cur.execute(f"""
        SELECT user_id, chapter_id, updated_at
        FROM {tbl}
        WHERE play_status = 1 AND updated_at >= %s AND updated_at < %s
    """, (overall_start, overall_end))
    for user_id, chapter_id, updated_at in cur.fetchall():
        key = (user_id, chapter_id)
        d = updated_at.date() if hasattr(updated_at, 'date') else datetime.strptime(str(updated_at)[:10], '%Y-%m-%d').date()
        if key not in consumption_map or d < consumption_map[key]:
            consumption_map[key] = d

print(f"  after deduplication: {len(consumption_map)} rows")
|
||||
|
||||
# ===== Step 3: map character ids to account ids =====
print("Step 3: mapping characters to accounts...")
all_uids = list(set(k[0] for k in consumption_map))
char2acct = {}
bs = 500
for i in range(0, len(all_uids), bs):
    batch = all_uids[i:i+bs]
    ph = ','.join(['%s'] * len(batch))
    cur.execute(f"SELECT id, account_id FROM bi_vala_app_character WHERE id IN ({ph})", batch)
    for cid, aid in cur.fetchall():
        char2acct[cid] = aid
print(f"  mapped: {len(char2acct)}")
|
||||
|
||||
# ===== Step 4: weekly aggregation =====
print("Step 4: aggregating by week...")
results = []
for ws, we in weeks:
    # Paying users as of the end of the week, grouped by category
    paid_by_cat = defaultdict(set)
    for aid in user_data:
        if is_paid_as_of(aid, we):
            paid_by_cat[user_data[aid]['category']].add(aid)

    cons_by_cat = defaultdict(int)
    cons_users_by_cat = defaultdict(set)

    for (uid, ch_id), cons_date in consumption_map.items():
        if ws <= cons_date <= we:
            aid = char2acct.get(uid)
            if aid:
                cat = user_data.get(aid, {}).get('category', '其他')
                if aid in paid_by_cat.get(cat, set()):
                    cons_by_cat[cat] += 1
                    cons_users_by_cat[cat].add(aid)

    row = {'ws': ws, 'we': we}
    for cat in ['仅L1', '仅L2', 'L1+L2', '其他', '合计']:
        if cat == '合计':
            # Categories are disjoint, so the total is the sum over categories
            n_paid = sum(len(v) for v in paid_by_cat.values())
            n_cons = sum(cons_by_cat.values())
            n_cons_users = len(set.union(*cons_users_by_cat.values())) if cons_users_by_cat else 0
        else:
            n_paid = len(paid_by_cat.get(cat, set()))
            n_cons = cons_by_cat.get(cat, 0)
            n_cons_users = len(cons_users_by_cat.get(cat, set()))

        row[f'{cat}_paid'] = n_paid
        row[f'{cat}_cons'] = n_cons
        row[f'{cat}_cons_users'] = n_cons_users
        row[f'{cat}_avg_all'] = round(n_cons / n_paid, 2) if n_paid > 0 else 0
        row[f'{cat}_avg_cons'] = round(n_cons / n_cons_users, 2) if n_cons_users > 0 else 0

    results.append(row)

cur.close()
conn.close()
|
||||
|
||||
# ===== Build the workbook =====
print("\nGenerating Excel...")
wb = openpyxl.Workbook()

# Styles
header_font = Font(name='微软雅黑', bold=True, size=10, color='FFFFFF')
header_fill = PatternFill(start_color='2F5496', end_color='2F5496', fill_type='solid')
data_font = Font(name='微软雅黑', size=10)
title_font = Font(name='微软雅黑', bold=True, size=14, color='2F5496')
subtitle_font = Font(name='微软雅黑', bold=True, size=11, color='2F5496')
border = Border(left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin'))
center = Alignment(horizontal='center', vertical='center')

l1_fill = PatternFill(start_color='DAEEF3', end_color='DAEEF3', fill_type='solid')
l2_fill = PatternFill(start_color='FDE9D9', end_color='FDE9D9', fill_type='solid')
l1l2_fill = PatternFill(start_color='E4DFEC', end_color='E4DFEC', fill_type='solid')
total_fill = PatternFill(start_color='D9EAD3', end_color='D9EAD3', fill_type='solid')


def apply_cell(ws, row, col, value, font=data_font, fill=None, border_style=border, align=center):
    """Write a data cell with the given font, border, alignment and optional fill."""
    c = ws.cell(row=row, column=col, value=value)
    c.font, c.border, c.alignment = font, border_style, align
    if fill:
        c.fill = fill
    return c


def apply_header(ws, row, col, value):
    """Write a header cell with the shared header style."""
    c = ws.cell(row=row, column=col, value=value)
    c.font, c.fill, c.border, c.alignment = header_font, header_fill, border, center
    return c
|
||||
|
||||
# ===== Sheet 1: overview =====
ws1 = wb.active
ws1.title = "概览"
ws1.merge_cells('A1:G1')
apply_cell(ws1, 1, 1, "付费用户 L1/L2 课消分析", font=title_font, border_style=Border(), align=Alignment(horizontal='left'))
ws1.merge_cells('A2:G2')
apply_cell(ws1, 2, 1, "数据区间: 2025-09-01 ~ 2026-05-10 | 更新日期: 2026-05-14", font=Font(name='微软雅黑', size=9, color='666666'), border_style=Border(), align=Alignment(horizontal='left'))

# Metric definitions
notes = [
    "口径说明:",
    "• 课消:用户首次完成某一课时(play_status=1,按(user_id,chapter_id)取最早updated_at)",
    "• L1商品: goods_id IN (57,60,63) | L2商品: goods_id IN (31,32,33,54) | L1+L2商品: goods_id=61",
    "• 付费用户:status=1 + deleted_at IS NULL + 有订单 + 未全部退款",
    "• 人均课消 = 周内课消次数 / 付费用户数",
    "• 有消用户人均 = 周内课消次数 / 至少完成1次课消的付费用户数",
]
for i, note in enumerate(notes):
    apply_cell(ws1, 4+i, 1, note, font=Font(name='微软雅黑', size=9), border_style=Border(), align=Alignment(horizontal='left'))
|
||||
|
||||
# Summary table
row = 11
ws1.merge_cells(f'A{row}:K{row}')
apply_cell(ws1, row, 1, "付费用户分类(截至最后一周)", font=subtitle_font, border_style=Border(), align=Alignment(horizontal='left'))
row += 1

headers_summary = ['分类', '付费用户数', '占比']
for j, h in enumerate(headers_summary, 1):
    apply_header(ws1, row, j, h)
row += 1

last = results[-1]
cats_data = [('仅L1', last['仅L1_paid']), ('仅L2', last['仅L2_paid']), ('L1+L2', last['L1+L2_paid'])]
total = sum(v for _, v in cats_data)
for cat, v in cats_data:
    apply_cell(ws1, row, 1, cat)
    apply_cell(ws1, row, 2, v)
    # Guard against an empty data set to avoid dividing by zero
    apply_cell(ws1, row, 3, f"{v/total*100:.1f}%" if total else "0.0%")
    if '仅L1' in cat:
        fill = l1_fill
    elif '仅L2' in cat:
        fill = l2_fill
    else:
        fill = l1l2_fill
    for c in range(1, 4):
        ws1.cell(row=row, column=c).fill = fill
    row += 1

apply_cell(ws1, row, 1, '合计', font=Font(name='微软雅黑', bold=True, size=10))
apply_cell(ws1, row, 2, total, font=Font(name='微软雅黑', bold=True, size=10))
apply_cell(ws1, row, 3, '100%', font=Font(name='微软雅黑', bold=True, size=10))
for c in range(1, 4):
    ws1.cell(row=row, column=c).fill = total_fill
|
||||
|
||||
# Recent trend summary
row += 2
ws1.merge_cells(f'A{row}:K{row}')
apply_cell(ws1, row, 1, "近期人均课消趋势", font=subtitle_font, border_style=Border(), align=Alignment(horizontal='left'))
row += 1

trend_headers = ['周', '合计人均', '仅L1人均', '仅L2人均', 'L1+L2人均', '合计有消人均', '仅L1有消人均', '仅L2有消人均', 'L1+L2有消人均']
for j, h in enumerate(trend_headers, 1):
    apply_header(ws1, row, j, h)
row += 1

# Result keys in the same order as trend_headers[1:]
trend_keys = ['合计_avg_all', '仅L1_avg_all', '仅L2_avg_all', 'L1+L2_avg_all',
              '合计_avg_cons', '仅L1_avg_cons', '仅L2_avg_cons', 'L1+L2_avg_cons']
small_font = Font(name='微软雅黑', size=9)
for r in results[-8:]:  # last 8 weeks
    wl = f"{r['ws'].strftime('%m/%d')}-{r['we'].strftime('%m/%d')}"
    apply_cell(ws1, row, 1, wl, font=small_font)
    for j, k in enumerate(trend_keys, 2):
        apply_cell(ws1, row, j, r[k], font=small_font)
    row += 1

# Column widths
for col in range(1, 10):
    ws1.column_dimensions[get_column_letter(col)].width = 14
|
||||
|
||||
# ===== Sheet 2: weekly detail =====
ws2 = wb.create_sheet("每周明细")

# Header rows: row 1 holds the group names, row 2 the per-category names
row2 = 1
# Column groups: every metric is broken out by 合计 / 仅L1 / 仅L2 / L1+L2
group_headers = [
    ('付费用户数', ['合计', '仅L1', '仅L2', 'L1+L2']),
    ('课消次数', ['合计', '仅L1', '仅L2', 'L1+L2']),
    ('有课消用户数', ['合计', '仅L1', '仅L2', 'L1+L2']),
    ('人均课消(全部付费用户)', ['合计', '仅L1', '仅L2', 'L1+L2']),
    ('人均课消(有课消用户)', ['合计', '仅L1', '仅L2', 'L1+L2']),
]

apply_header(ws2, row2, 1, '周')
apply_header(ws2, row2, 2, '周一起')
apply_header(ws2, row2, 3, '周日')
col = 4
spans = []
for grp_name, cols in group_headers:
    start_col = col
    col += len(cols)
    end_col = col - 1
    # Merge the group title across all of its sub-columns
    if start_col < end_col:
        ws2.merge_cells(start_row=row2, start_column=start_col, end_row=row2, end_column=end_col)
    apply_header(ws2, row2, start_col, grp_name)
    spans.append((start_col, end_col, grp_name, cols))
    for ic, cname in enumerate(cols):
        apply_header(ws2, row2 + 1, start_col + ic, cname)
col_count = col - 1

# Data rows (one per week, starting below the two header rows)
# Suffix of the result key for each header group, e.g. '仅L1' + '_paid'
key_suffix = {
    '付费用户数': 'paid',
    '课消次数': 'cons',
    '有课消用户数': 'cons_users',
    '人均课消(全部付费用户)': 'avg_all',
    '人均课消(有课消用户)': 'avg_cons',
}
row2 = 3
for r in results:
    wl = f"{r['ws'].strftime('%m/%d')}-{r['we'].strftime('%m/%d')}"
    apply_cell(ws2, row2, 1, wl)
    apply_cell(ws2, row2, 2, r['ws'].strftime('%Y-%m-%d'))
    apply_cell(ws2, row2, 3, r['we'].strftime('%Y-%m-%d'))
    col = 4
    for grp_name, cols in group_headers:
        for cname in cols:
            val = r[f"{cname}_{key_suffix[grp_name]}"]
            apply_cell(ws2, row2, col, val)
            col += 1
    row2 += 1

# Column widths
ws2.column_dimensions['A'].width = 14
ws2.column_dimensions['B'].width = 12
ws2.column_dimensions['C'].width = 12
for ci in range(4, col_count + 1):
    ws2.column_dimensions[get_column_letter(ci)].width = 10

# Freeze the first 3 columns and the 2 header rows.
# 'D3' is the top-left unfrozen cell; 'D4' would also pin the first data row.
ws2.freeze_panes = 'D3'

# ===== Charts =====
chart_sheet = wb.create_sheet("图表")

# Chart 1: average lessons consumed per user, by category
chart1 = LineChart()
chart1.title = "人均课消数(全部付费用户)"
chart1.style = 10
chart1.y_axis.title = "课消数(节/周)"
chart1.x_axis.title = None
chart1.width = 28
chart1.height = 14
chart1.y_axis.scaling.min = 0

data_row_start = 3
data_row_end = row2 - 1

# Categories (week labels in column A of the detail sheet)
cats_ref = Reference(ws2, min_col=1, min_row=data_row_start, max_row=data_row_end)

# Series columns are derived from group_headers rather than hardcoded.
# With the current layout the "all paying users" average group lands on
# 合计 = col 16, 仅L1 = col 17, 仅L2 = col 18, L1+L2 = col 19.
header_row = 2
grp_col_map = {}
col = 4
for grp_name, cols in group_headers:
    grp_col_map[grp_name] = col
    col += len(cols)

# Average per user (all paying users) is the 4th group
start_avg = grp_col_map['人均课消(全部付费用户)']
colors = ['333333', '4A90D9', 'E85D47', '7B9E4B']
labels = ['合计', '仅L1', '仅L2', 'L1+L2']
for i in range(4):
    # Start at the header row so titles_from_data picks up the series name
    ref = Reference(ws2, min_col=start_avg + i, min_row=header_row, max_row=data_row_end)
    chart1.add_data(ref, titles_from_data=True)
    s = chart1.series[i]
    s.graphicalProperties.line.solidFill = colors[i]
    # The total line is drawn slightly thicker than the per-category lines
    s.graphicalProperties.line.width = 25000 if i == 0 else 20000
    if i > 0:
        s.graphicalProperties.line.dashStyle = 'solid'
chart1.set_categories(cats_ref)

chart_sheet.add_chart(chart1, "A1")

# Chart 2: growth in paying users
chart2 = LineChart()
chart2.title = "付费用户数增长趋势"
chart2.style = 10
chart2.y_axis.title = "用户数"
chart2.width = 28
chart2.height = 14

start_paid = grp_col_map['付费用户数']
for i in range(4):
    # Start at the header row so titles_from_data picks up the series name
    ref = Reference(ws2, min_col=start_paid + i, min_row=header_row, max_row=data_row_end)
    chart2.add_data(ref, titles_from_data=True)
    s = chart2.series[i]
    s.graphicalProperties.line.solidFill = colors[i]
    s.graphicalProperties.line.width = 25000 if i == 0 else 20000
chart2.set_categories(cats_ref)

chart_sheet.add_chart(chart2, "A18")

# ===== Save =====
path = '/root/.openclaw/workspace/output/course_consumption_by_level.xlsx'
wb.save(path)
print(f"\n✅ Excel 已保存: {path}")
print("   Sheet 1: 概览(口径说明 + 近期趋势)")
# Report the actual number of weeks written instead of a hardcoded count
print(f"   Sheet 2: 每周明细({len(results)}周完整数据)")
print("   Sheet 3: 图表(人均课消趋势 + 付费用户增长)")
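
The chart references above depend on knowing where each header group starts in the detail sheet, and the overview table depends on a percentage that is safe when the total is zero. The sketch below isolates both calculations so they can be checked without openpyxl. The helper names `group_start_columns` and `share_label` are hypothetical and do not appear in the script; the group layout is copied from `group_headers`, and the first data column is assumed to be 4, as in the script.

```python
from typing import Dict, List, Tuple

# Same shape as group_headers in the script: (group title, sub-column names)
GroupHeaders = List[Tuple[str, List[str]]]


def group_start_columns(group_headers: GroupHeaders, first_col: int = 4) -> Dict[str, int]:
    """Return the 1-based column index at which each header group starts."""
    starts: Dict[str, int] = {}
    col = first_col
    for name, sub_cols in group_headers:
        starts[name] = col
        col += len(sub_cols)  # the next group begins right after this one
    return starts


def share_label(part: float, total: float) -> str:
    """Format part/total as a one-decimal percentage, tolerating total == 0."""
    if not total:
        return "0.0%"
    return f"{part / total * 100:.1f}%"


CATEGORIES = ['合计', '仅L1', '仅L2', 'L1+L2']
GROUPS: GroupHeaders = [
    ('付费用户数', CATEGORIES),
    ('课消次数', CATEGORIES),
    ('有课消用户数', CATEGORIES),
    ('人均课消(全部付费用户)', CATEGORIES),
    ('人均课消(有课消用户)', CATEGORIES),
]

starts = group_start_columns(GROUPS)
print(starts['人均课消(全部付费用户)'])  # first column of the per-user average group
print(share_label(1, 4))
```

Deriving the offsets from the header definition keeps the charts pointed at the right columns if a group is added or reordered, which a hardcoded column number would not.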