🤖 Daily automatic backup - 2026-05-09 08:00:01
parent 51d7d39d2c
commit e4fb55d987
memory/2026-05-08-0958.md (Normal file, 41 lines)
@@ -0,0 +1,41 @@
# Session: 2026-05-08 09:58:04 GMT+8

- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
- **Session ID**: 15b373a7-36d0-4905-9c39-6e577ce079cd
- **Source**: feishu

## Conversation Summary

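assistant: Hi 承龙, I'm here. I saw your follow-up on the last lead-user analysis: you said only 姜小龙 had closed orders and suspected a data problem. Should I rerun that Excel file now to verify, or do you have a new file or another issue to handle?
user: [media attached: /root/.openclaw/media/inbound/3æ_è_çº_çº_ç_-2---fcc3c69c-362d-46d3-82de-aa308a04078e.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) | /root/.openclaw/media/inbound/3æ_è_çº_çº_ç_-2---fcc3c69c-362d-46d3-82de-aa308a04078e.xlsx]
[File: /root/.openclaw/media/inbound/3æ_è_çº_çº_ç_-2---fcc3c69c-362d-46d3-82de-aa308a04078e.xlsx]
assistant: Got the file. Let me read it first to see what it contains.
assistant: Understood. The file contains **532 lead records** with three columns:

- **用户ID** (user ID): the lead user's ID
- **线索进线日期** (lead intake date): 2026-03-01 to 2026-03-31
- **销售** (sales rep): three sales reps, 姜小龙, Bob, and Tom

What analysis would you like? For example:

1. **Count leads per sales rep** (how many leads 姜小龙/Bob/Tom were each assigned)
2. **Look up the purchase outcomes of these lead users** (which leads converted into paid orders, and what the GMV/GSV is)
3. **Join the lead data with the order data to produce a full report**

Please tell me what you need.
user: Use the lead_user_analysis.py script to process the Excel file above
assistant: Let me find that script first.
assistant: The script logic is clear. I'll now run it on the file you just sent.
assistant: The script ran successfully. Here is an overview of the results:
assistant: The report has been generated. Key figures:

**Overall:** 532 leads → 531 valid accounts and 628 roles, of which 141 rows have purchase records and 23 rows have refunds

| Sales rep | Lead users | Roles | Rows with purchases | Refund rows |
|------|-----------|--------|---------|--------|
| 姜小龙 | 181 | 221 | 221 | 10 |
| Bob | 166 | 186 | 186 | 5 |
| Tom | 185 | 221 | 221 | 8 |

The three sales reps have similar lead allocations and purchase-row counts. Sending you the file now:
assistant: The report has been sent. Take a look at the data: lead allocation, purchase conversion, and refund distribution are roughly balanced across the three sales reps, and this time all three have closed orders, unlike last time when only 姜小龙 had orders. Let me know if anything needs adjusting.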
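
The per-sales table in this summary can be cross-checked against its totals line with a few sums. This is a minimal sketch that uses only the figures quoted here; the variable names are illustrative and are not taken from `lead_user_analysis.py`. The purchase-row column sums to 628, the same as the role count, rather than the 141 quoted in the totals line, so one of those two figures may be mislabeled.

```python
# Figures copied from the summary table; names are illustrative only.
per_sales = {
    "姜小龙": {"leads": 181, "roles": 221, "purchase_rows": 221, "refund_rows": 10},
    "Bob": {"leads": 166, "roles": 186, "purchase_rows": 186, "refund_rows": 5},
    "Tom": {"leads": 185, "roles": 221, "purchase_rows": 221, "refund_rows": 8},
}
# Totals as stated in the "Overall" line of the summary.
reported = {"leads": 532, "roles": 628, "purchase_rows": 141, "refund_rows": 23}

# Sum each column across the three sales reps.
totals = {k: sum(row[k] for row in per_sales.values()) for k in reported}
# Keep only the columns whose sum disagrees with the reported total.
mismatches = {k: (totals[k], reported[k]) for k in reported if totals[k] != reported[k]}

for key in reported:
    status = "MISMATCH" if key in mismatches else "OK"
    print(f"{key:>14}: table sum = {totals[key]:>4}, reported = {reported[key]:>4}  {status}")
```

Only the purchase-row column disagrees; leads, roles, and refund rows all match the totals line.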
memory/2026-05-08-1035.md (Normal file, 32 lines)
@@ -0,0 +1,32 @@
# Session: 2026-05-08 10:35:24 GMT+8

- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
- **Session ID**: f5e585ad-8ada-45fe-a0fa-4dfef0f57c77
- **Source**: feishu

## Conversation Summary

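assistant: The content is large (6,000+ lines), so I'll dispatch two subtasks to perform the overwrites separately.
assistant: Both transcription tasks have been started in parallel; waiting for them to finish.
assistant: The first transcription (数据字典, the data dictionary) is done. Waiting for the second (数据地图, the data map)…
assistant: The second transcription subtask appears to have been interrupted. I'll handle it directly on the main thread: read the source content first, then overwrite the target document.
assistant: The file is large (6,000+ lines), so I need to start another subtask to perform the overwrite. The earlier subtask may have been interrupted by a tool-call problem.
assistant: The 数据地图 overwrite task has been restarted; waiting for it to finish.
assistant: Understood: subtasks have no permission to use the Feishu tools, so the call must come from the main thread. The file has 6,012 lines; I'll read it in chunks and overwrite directly from the main thread.
assistant: The file is too large to pass as an argument from the main session. I'll use exec to call the Feishu API directly through lark-cli for the overwrite.
assistant: The file content can be passed via stdin. Running the overwrite:
assistant: lark-cli needs to be bound to OpenClaw first:
assistant: Binding succeeded. Rerunning the overwrite:
assistant: The task is being processed asynchronously. Checking its status:
assistant: Waiting for the asynchronous task to finish, then verifying that both succeeded.
assistant: Both transcription tasks are complete. Verification results:

| # | Source (docx) | → | Target (wiki) | Status |
|---|-----------|----|----------|------|
| 1 | 数据字典 (38KB) | → | [数据字典](https://makee-interactive.feishu.cn/wiki/TlJlwMTdkiP49ukOC1WcrY1Pn0g) | ✅ Updated |
| 2 | 数据地图 (104KB) | → | [数据地图](https://makee-interactive.feishu.cn/wiki/Bs8JwPbfqijQgfkz6VqcKtagnKd) | ✅ Updated |

The titles and content of both wiki documents match the source documents exactly:
- **数据字典** (data dictionary): 8 chapters covering growth metrics, order revenue, sales channels, the course system, learning metrics, a quick-reference appendix, and more
- **数据地图** (data map): field definitions for all core PostgreSQL/MySQL tables, SQL templates, channel mappings, and more
assistant: Temporary files have been cleaned up. 承龙, the transcription is fully complete, and you can view and use both documents directly in the knowledge base.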
memory/2026-05-08-1110.md (Normal file, 107 lines)
@@ -0,0 +1,107 @@
# Session: 2026-05-08 11:10:51 GMT+8

- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
- **Session ID**: f38b476c-1c19-4f97-b19f-80cd61957723
- **Source**: feishu

## Conversation Summary

user: I'd like to see a "pure" version of the conversion rate. The numerator is the number of users with orders whose keyfrom is app-active-h5-0-0 or app-sales-bj-qhm-0 (excluding test accounts and deleted accounts). The denominator is all registered users minus the users who have orders whose keyfrom is neither app-active-h5-0-0 nor app-sales-bj-qhm-0 (excluding test accounts and deleted accounts). Break it down by month, and compute the numerator both including and excluding refunded orders; the denominator always uses the figure that includes refunded orders.
assistant: 李承龙, I understand the requirement. Let me confirm a few key points before starting.

Let me first check the relevant table schemas to confirm which fields identify deleted accounts and registration time.
assistant: OK, the table schemas are confirmed. Let me verify the data volume first, then run the full query.
assistant: The data volume is small, so I'll run the full query.
assistant: The query is done. Let me run one more integrity check to confirm the numbers at each step are reasonable.
assistant: The data passes validation and the logic is self-consistent. Results:

## Pure in-app conversion rate (by month)

| Month | Pure registered users | In-app payers (incl. refunds) | In-app payers (excl. refunds) | Conversion rate (incl. refunds) | Conversion rate (excl. refunds) |
|------|:----------:|:----------:|:----------:|:--------:|:--------:|
| 2025-03 | 136 | 0 | 0 | 0.00% | 0.00% |
| 2025-04 | 73 | 0 | 0 | 0.00% | 0.00% |
| 2025-05 | 46 | 0 | 0 | 0.00% | 0.00% |
| 2025-06 | 557 | 0 | 0 | 0.00% | 0.00% |
| 2025-07 | 157 | 0 | 0 | 0.00% | 0.00% |
| 2025-08 | 351 | 0 | 0 | 0.00% | 0.00% |
| 2025-09 | 1,164 | 16 | 16 | 1.37% | 1.37% |
| 2025-10 | 1,992 | 31 | 30 | 1.56% | 1.51% |
| 2025-11 | 2,738 | 21 | 21 | 0.77% | 0.77% |
| 2025-12 | 3,306 | 23 | 22 | 0.70% | 0.67% |
| 2026-01 | 1,785 | 27 | 24 | 1.51% | 1.34% |
| 2026-02 | 1,480 | 15 | 14 | 1.01% | 0.95% |
| 2026-03 | 3,556 | 76 | 76 | 2.14% | 2.14% |
| 2026-04 | 4,283 | 61 | 60 | 1.42% | 1.40% |
| 2026-05 | 472 | 5 | 5 | 1.06% | 1.06% |

---

### Definitions

**Denominator (pure registered users):**
- Users in bi_vala_app_account with status=1 and deleted_at IS NULL
- Excluding users who have ever had an off-app order in bi_vala_order (key_from is neither `app-active-h5-0-0` nor `app-sales-bj-qhm-0`)
- Grouped by the user's registration month

**Numerator:**
- **Including refunds:** users in the denominator who have had an in-app order in bi_vala_order
- **Excluding refunds:** the users above, minus those whose in-app orders were all refunded (refund criterion: bi_refund_order.status=3 and bi_vala_order.order_status=4)

**Supplementary data:** 324 in-app orders in total from 297 users, of which 31 orders from 21 users were refunded, and 16 users had all of their in-app orders refunded.
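
The definitions above can be sketched in pandas. This is a minimal illustration on made-up rows, not the query that was actually run: the column names `user_id`, `created_at`, and `is_refunded` are assumptions, the test-account filter is omitted because the summary does not name its field, and `is_refunded` stands in for the `bi_refund_order.status=3` and `bi_vala_order.order_status=4` check.

```python
import pandas as pd

IN_APP = {"app-active-h5-0-0", "app-sales-bj-qhm-0"}

# Hypothetical rows shaped like bi_vala_app_account (column names are assumptions).
accounts = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "status": [1, 1, 1, 1, 0],
    "deleted_at": [None, None, None, None, None],
    "created_at": pd.to_datetime(
        ["2026-03-02", "2026-03-15", "2026-03-20", "2026-04-01", "2026-04-03"]),
})
# Hypothetical rows shaped like bi_vala_order; is_refunded stands in for the refund join.
orders = pd.DataFrame({
    "user_id": [1, 2, 2, 3, 4],
    "key_from": ["app-active-h5-0-0", "app-sales-bj-qhm-0", "app-sales-bj-qhm-0",
                 "some-other-channel", "app-active-h5-0-0"],
    "is_refunded": [False, True, True, False, False],
})

valid = accounts[(accounts["status"] == 1) & accounts["deleted_at"].isna()]
in_app = orders["key_from"].isin(IN_APP)

# Denominator: valid users with no off-app order, grouped by registration month.
off_app_users = set(orders.loc[~in_app, "user_id"])
pure = valid[~valid["user_id"].isin(off_app_users)].copy()
pure["month"] = pure["created_at"].dt.strftime("%Y-%m")

# Numerators: users with any in-app order, and users with at least one
# in-app order that was not refunded (i.e. not "all refunded").
payers_incl = set(orders.loc[in_app, "user_id"])
payers_excl = set(orders.loc[in_app & ~orders["is_refunded"], "user_id"])
pure["paid_incl"] = pure["user_id"].isin(payers_incl)
pure["paid_excl"] = pure["user_id"].isin(payers_excl)

result = pure.groupby("month").agg(
    registered=("user_id", "nunique"),
    payers_incl=("paid_incl", "sum"),
    payers_excl=("paid_excl", "sum"),
)
result["rate_incl_pct"] = (result["payers_incl"] / result["registered"] * 100).round(2)
result["rate_excl_pct"] = (result["payers_excl"] / result["registered"] * 100).round(2)
print(result)
```

In this toy data, user 3 leaves the denominator because of an off-app order, and user 2 counts as a payer only in the including-refunds version because both of that user's in-app orders were refunded.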
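
### Key observations

1. **In-app payments first appear in 2025-09**; pure users registered from 2025-03 through 2025-08 have a conversion rate of 0
2. **2026-03 has the highest conversion rate** (2.14%); that month also has a large registration count (3,556) and the most in-app payers (76)
3. **Refunds have little effect on the conversion rate** (a monthly gap of 0 to 0.17 pp); only 16 users had all of their in-app orders refunded
4. **The including-refunds and excluding-refunds series are very close**, which suggests the user-level refund rate among in-app payers is not high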
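
The observations above can be re-derived from the monthly table. A small sketch, using only the counts printed in that table (months with no in-app payers are left out):

```python
# (registered, payers incl. refunds, payers excl. refunds), copied from the monthly table.
monthly = {
    "2025-09": (1164, 16, 16),
    "2025-10": (1992, 31, 30),
    "2025-11": (2738, 21, 21),
    "2025-12": (3306, 23, 22),
    "2026-01": (1785, 27, 24),
    "2026-02": (1480, 15, 14),
    "2026-03": (3556, 76, 76),
    "2026-04": (4283, 61, 60),
    "2026-05": (472, 5, 5),
}

rates = {}
for month, (registered, incl, excl) in monthly.items():
    rate_incl = incl / registered * 100
    rate_excl = excl / registered * 100
    # Store both rates and their gap in percentage points.
    rates[month] = (rate_incl, rate_excl, rate_incl - rate_excl)
    print(f"{month}: {rate_incl:.2f}% incl., {rate_excl:.2f}% excl., "
          f"gap {rate_incl - rate_excl:.2f} pp")

best_month = max(rates, key=lambda m: rates[m][0])
max_gap = max(gap for _, _, gap in rates.values())
print(f"highest conversion: {best_month}; largest gap: {max_gap:.2f} pp")
```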

Would you like an Excel export or a deeper look at any dimension?
user: Please also add GMV and GSV
assistant: The results are in. Summary:

## Pure in-app conversion rate + GMV/GSV (monthly, by registration month)

| Month | Pure registered users | Payers (incl. refunds) | Conversion rate (incl. refunds) | Payers (excl. refunds) | Conversion rate (excl. refunds) |
|------|:----------:|:------:|:------:|:------:|:------:|
| 2025-03 | 136 | 0 | 0.00% | 0 | 0.00% |
| 2025-04 | 73 | 0 | 0.00% | 0 | 0.00% |
| 2025-05 | 46 | 0 | 0.00% | 0 | 0.00% |
| 2025-06 | 557 | 0 | 0.00% | 0 | 0.00% |
| 2025-07 | 157 | 0 | 0.00% | 0 | 0.00% |
| 2025-08 | 351 | 0 | 0.00% | 0 | 0.00% |
| 2025-09 | 1,164 | 16 | 1.37% | 16 | 1.37% |
| 2025-10 | 1,992 | 31 | 1.56% | 30 | 1.51% |
| 2025-11 | 2,738 | 21 | 0.77% | 21 | 0.77% |
| 2025-12 | 3,306 | 23 | 0.70% | 22 | 0.67% |
| 2026-01 | 1,785 | 27 | 1.51% | 24 | 1.34% |
| 2026-02 | 1,480 | 15 | 1.01% | 14 | 0.95% |
| 2026-03 | 3,556 | 76 | 2.14% | 76 | 2.14% |
| 2026-04 | 4,283 | 61 | 1.42% | 60 | 1.40% |
| 2026-05 | 472 | 5 | 1.06% | 5 | 1.06% |

| Month | GMV (incl. refunds) | GSV (incl. refunds) | GMV (excl. refunds) | GSV (excl. refunds) |
|------|:----------:|:----------:|:----------:|:----------:|
| 2025-09 | ¥19,983 | ¥19,384 | ¥19,384 | ¥19,384 |
| 2025-10 | ¥38,867 | ¥36,868 | ¥36,868 | ¥36,868 |
| 2025-11 | ¥26,579 | ¥26,579 | ¥26,579 | ¥26,579 |
| 2025-12 | ¥28,574 | ¥27,376 | ¥27,376 | ¥27,376 |
| 2026-01 | ¥34,869 | ¥31,073 | ¥31,073 | ¥31,073 |
| 2026-02 | ¥19,983 | ¥17,984 | ¥17,984 | ¥17,984 |
| 2026-03 | ¥110,415 | ¥106,917 | ¥106,917 | ¥106,917 |
| 2026-04 | ¥78,335 | ¥75,737 | ¥75,737 | ¥75,737 |
| 2026-05 | ¥4,395 | ¥4,395 | ¥4,395 | ¥4,395 |
| **Total** | **¥362,000** | **¥346,313** | **¥346,313** | **¥346,313** |

> 2025-03 through 2025-08 have no in-app payments, so GMV/GSV are 0 and those months are omitted from the table.

---

### Definitions summary

- **GMV (incl. refunds)**: the total amount of all in-app orders placed by in-app payers in the pure user pool, with refunds not removed
- **GSV (incl. refunds)**: GMV (incl. refunds) minus the refunded amount
- **GMV/GSV (excl. refunds)**: the in-app order amount after excluding refunded orders and users whose orders were all refunded (here GMV = GSV because no refunded orders remain)
- Total refunded amount in the including-refunds version: ¥362,000 - ¥346,313 = **¥15,687** (4.3% of GMV)
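
The totals and the refund share quoted above follow directly from the monthly figures. A short check, using only numbers from the GMV/GSV table (amounts in ¥):

```python
# (GMV incl. refunds, GSV incl. refunds) per registration month, copied from the table.
monthly = {
    "2025-09": (19_983, 19_384),
    "2025-10": (38_867, 36_868),
    "2025-11": (26_579, 26_579),
    "2025-12": (28_574, 27_376),
    "2026-01": (34_869, 31_073),
    "2026-02": (19_983, 17_984),
    "2026-03": (110_415, 106_917),
    "2026-04": (78_335, 75_737),
    "2026-05": (4_395, 4_395),
}

total_gmv = sum(gmv for gmv, _ in monthly.values())
total_gsv = sum(gsv for _, gsv in monthly.values())
refunded = total_gmv - total_gsv  # GSV = GMV - refunded amount
refund_share = refunded / total_gmv * 100

print(f"GMV ¥{total_gmv:,}  GSV ¥{total_gsv:,}  "
      f"refunded ¥{refunded:,} ({refund_share:.1f}% of GMV)")
```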
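scripts/daily_reg_analysis.py (Normal file, 255 lines)
@@ -0,0 +1,255 @@
#!/usr/bin/env python3
"""
Regression analysis of daily new registrations, January-April 2026 (V4: proportional weekday effect + floor)
Improvements:
1. The day-of-week "tide" effect is now a proportional adjustment: adjusted fit = trend × (1 + ratio), rather than trend + fixed offset
   Rationale: with a January baseline of 50, +11 is +22%; with an April baseline of 120, +11 is only +9%, so a fixed offset is diluted as the trend grows
2. On activity-affected days the fitted value replaces the actual value only when the actual value is higher; an actual value below the fit is kept as-is
"""

import numpy as np
import pandas as pd
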
# ==================== Full data set ====================
raw_data = {
    '2026-01-01': 78, '2026-01-02': 74, '2026-01-03': 69, '2026-01-04': 44,
    '2026-01-05': 56, '2026-01-06': 33, '2026-01-07': 52, '2026-01-08': 60,
    '2026-01-09': 58, '2026-01-10': 84, '2026-01-11': 75, '2026-01-12': 35,
    '2026-01-13': 46, '2026-01-14': 59, '2026-01-15': 32, '2026-01-16': 37,
    '2026-01-17': 67, '2026-01-18': 71, '2026-01-19': 53, '2026-01-20': 49,
    '2026-01-21': 42, '2026-01-22': 63, '2026-01-23': 48, '2026-01-24': 72,
    '2026-01-25': 86, '2026-01-26': 62, '2026-01-27': 58, '2026-01-28': 150,
    '2026-01-29': 106, '2026-01-30': 88, '2026-01-31': 61,
    '2026-02-01': 55, '2026-02-02': 53, '2026-02-03': 42, '2026-02-04': 39,
    '2026-02-05': 42, '2026-02-06': 55, '2026-02-07': 37, '2026-02-08': 47,
    '2026-02-09': 42, '2026-02-10': 60, '2026-02-11': 265, '2026-02-12': 59,
    '2026-02-13': 42, '2026-02-14': 32, '2026-02-15': 41, '2026-02-16': 28,
    '2026-02-17': 52, '2026-02-18': 23, '2026-02-19': 35, '2026-02-20': 26,
    '2026-02-21': 37, '2026-02-22': 36, '2026-02-23': 47, '2026-02-24': 45,
    '2026-02-25': 64, '2026-02-26': 182, '2026-02-27': 102, '2026-02-28': 168,
    '2026-03-01': 126, '2026-03-02': 108, '2026-03-03': 78, '2026-03-04': 111,
    '2026-03-05': 453, '2026-03-06': 218, '2026-03-07': 241, '2026-03-08': 207,
    '2026-03-09': 168, '2026-03-10': 109, '2026-03-11': 135, '2026-03-12': 178,
    '2026-03-13': 245, '2026-03-14': 131, '2026-03-15': 132, '2026-03-16': 110,
    '2026-03-17': 96, '2026-03-18': 96, '2026-03-19': 78, '2026-03-20': 103,
    '2026-03-21': 133, '2026-03-22': 126, '2026-03-23': 85, '2026-03-24': 80,
    '2026-03-25': 94, '2026-03-26': 85, '2026-03-27': 95, '2026-03-28': 105,
    '2026-03-29': 101, '2026-03-30': 82, '2026-03-31': 100,
    '2026-04-01': 138, '2026-04-02': 165, '2026-04-03': 779, '2026-04-04': 341,
    '2026-04-05': 186, '2026-04-06': 207, '2026-04-07': 212, '2026-04-08': 751,
    '2026-04-09': 341, '2026-04-10': 129, '2026-04-11': 174, '2026-04-12': 151,
    '2026-04-13': 118, '2026-04-14': 133, '2026-04-15': 126, '2026-04-16': 115,
    '2026-04-17': 84, '2026-04-18': 117, '2026-04-19': 120, '2026-04-20': 89,
    '2026-04-21': 100, '2026-04-22': 179, '2026-04-23': 139, '2026-04-24': 122,
    '2026-04-25': 142, '2026-04-26': 137, '2026-04-27': 120, '2026-04-28': 164,
    '2026-04-29': 65, '2026-04-30': 58,
}

dates = sorted(raw_data.keys())
date_series = pd.to_datetime(dates)
values = np.array([raw_data[d] for d in dates], dtype=float)
n = len(values)
x = np.arange(n, dtype=float)
dow = np.array([d.weekday() for d in date_series])  # 0=Mon, 6=Sun

# ==================== Activity markers ====================
# date -> (label, number of following days treated as aftereffect)
activity_events = {
    '2026-01-28': ('event', 1),
    '2026-02-11': ('event', 0),
    '2026-02-26': ('event', 4),
    '2026-03-05': ('event', 3),
    '2026-03-09': ('event', 0),
    '2026-03-12': ('event', 0),
    '2026-03-13': ('event', 0),
    '2026-04-03': ('major event', 4),
    '2026-04-08': ('major event', 2),
    '2026-04-22': ('minor event', 1),
    '2026-04-28': ('minor event', 0),
}

affected_reason = {}
for d_str, (reason, aftereffect_days) in activity_events.items():
    base_idx = dates.index(d_str)
    affected_reason[d_str] = reason
    for offset in range(1, aftereffect_days + 1):
        idx = base_idx + offset
        if idx < n:
            ae_date = dates[idx]
            # Do not overwrite a day that already has its own label
            if ae_date not in affected_reason:
                affected_reason[ae_date] = f'{dates[base_idx]} aftereffect'

is_affected = np.array([d in affected_reason for d in dates])

# ==================== LOESS ====================
def loess_smooth(x_data, y_data, x_eval, frac=0.28, deg=2):
    """Local polynomial regression with tricube weights, evaluated at x_eval."""
    n_eval = len(x_eval)
    n_data = len(x_data)
    y_smooth = np.zeros(n_eval)
    # Neighborhood size: a fraction of the data, at least deg + 2 points, at most all points
    k = min(max(int(frac * n_data), deg + 2), n_data)
    for j in range(n_eval):
        x0 = x_eval[j]
        dist = np.abs(x_data - x0)
        # kth must be < n_data, so partition on k - 1 to get the k nearest points
        idx = np.argpartition(dist, k - 1)[:k]
        x_nn, y_nn, d_nn = x_data[idx], y_data[idx], dist[idx]
        max_dist = np.max(d_nn)
        w = (1 - (d_nn / max_dist) ** 3) ** 3 if max_dist > 0 else np.ones(k)
        # Design matrix with columns 1, x, ..., x**deg
        A = np.vander(x_nn, deg + 1, increasing=True)
        W = np.diag(w)
        try:
            beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y_nn)
            y_smooth[j] = np.polyval(beta[::-1], x0)
        except np.linalg.LinAlgError:
            # Singular system: fall back to the weighted local mean
            y_smooth[j] = np.average(y_nn, weights=w)
    return np.maximum(y_smooth, 0)

clean_mask = ~is_affected
clean_x, clean_y, clean_dow = x[clean_mask], values[clean_mask], dow[clean_mask]

# Pass 1: LOESS fit of the underlying trend, using only days not affected by activities
trend_raw = loess_smooth(clean_x, clean_y, x, frac=0.28, deg=2)

# ==================== Day-of-week effect: proportional ====================
# Relative residual on clean data = (actual - trend) / trend
residual_pct = (clean_y - trend_raw[clean_mask]) / trend_raw[clean_mask]
dow_ratio = np.zeros(7)
for d in range(7):
    mask = clean_dow == d
    if np.sum(mask) > 0:
        dow_ratio[d] = np.mean(residual_pct[mask])

# Center the ratios so they sum to 0 and the adjustment does not shift the overall level
dow_ratio -= np.mean(dow_ratio)

print("=" * 80)
print("Daily new registration regression analysis, March-April 2026, V4 (proportional weekday effect + floor)")
print("=" * 80)

print("\n📊 Day-of-week effect (proportional):")
dow_names = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
for i, name in enumerate(dow_names):
    bar = '█' * max(1, int(abs(dow_ratio[i]) * 200))
    sign = '+' if dow_ratio[i] >= 0 else ''
    print(f"  {name}: {sign}{dow_ratio[i]*100:.1f}%  {bar}")

# ==================== Final fit = trend × (1 + weekday ratio) ====================
trend_with_dow = trend_raw * (1 + dow_ratio[dow])
trend_with_dow = np.maximum(trend_with_dow, 0)

# ==================== Effective registrations ====================
# Rule: activity-affected day → min(actual, fitted); other days → keep the actual value
effective = np.where(is_affected, np.minimum(values, trend_with_dow), values)

print(f"\n{'='*80}")
print("📋 March-April 2026 final results")
print(f"{'='*80}")
print(f"{'Date':>12} {'DoW':>4} | {'Actual':>6} | {'Impact':>10} | {'Trend':>6} | {'Trend+DoW':>9} | {'Effective':>9} | {'Removed':>7}")
print("-" * 95)

mar_start = dates.index('2026-03-01')
apr_end = dates.index('2026-04-30') + 1

for i in range(mar_start, apr_end):
    d = dates[i]
    dw = date_series[i].strftime('%a')
    tag = affected_reason.get(d, '-')
    trend_val = trend_raw[i]
    tdow_val = trend_with_dow[i]
    eff_val = effective[i]
    removed = values[i] - eff_val
    print(f"{d:>12} {dw:>4} | {values[i]:>6.0f} | {tag:>10} | {trend_val:>6.0f} | {tdow_val:>9.0f} | {eff_val:>9.0f} | {removed:>+7.0f}")

# ==================== Summary ====================
mar_apr_actual = np.sum(values[mar_start:apr_end])
mar_apr_eff = np.sum(effective[mar_start:apr_end])
mar_apr_affected_count = int(np.sum(is_affected[mar_start:apr_end]))

print(f"\n{'='*80}")
print("📊 Final summary")
print(f"{'='*80}")
print(f"  Days covered: {apr_end - mar_start}")
print(f"  Activity-affected days: {mar_apr_affected_count}")
print()
print(f"  Actual registrations (total): {mar_apr_actual:.0f}")
print(f"  Effective registrations (total): {mar_apr_eff:.0f}")
print(f"  Removed as activity volume: {mar_apr_actual - mar_apr_eff:.0f} ({(mar_apr_actual-mar_apr_eff)/mar_apr_actual*100:.1f}%)")
print()
print(f"  Actual daily average: {np.mean(values[mar_start:apr_end]):.0f}")
print(f"  Effective daily average: {np.mean(effective[mar_start:apr_end]):.0f}")
print()
print(f"  Jan-Feb clean daily average: {np.mean(values[:mar_start][clean_mask[:mar_start]]):.0f}")
print(f"  Mar-Apr effective daily average: {np.mean(effective[mar_start:apr_end]):.0f}")

# ==================== Comparison: V3 fixed offset vs V4 proportional ====================
print(f"\n{'='*80}")
print("📊 V3 (fixed offset) vs V4 (proportional) comparison")
print(f"{'='*80}")
# Recompute with the fixed-offset method for comparison
residuals_abs = clean_y - trend_raw[clean_mask]
dow_abs = np.zeros(7)
for d in range(7):
    mask = clean_dow == d
    if np.sum(mask) > 0:
        dow_abs[d] = np.mean(residuals_abs[mask])
dow_abs -= np.mean(dow_abs)
trend_v3 = trend_raw + dow_abs[dow]
trend_v3 = np.maximum(trend_v3, 0)
eff_v3 = np.where(is_affected, np.minimum(values, trend_v3), values)

print(f"{'Method':>20} | {'Effective total':>15} | {'Removed':>8} | {'Effective daily avg':>19}")
print(f"{'V3 fixed offset':>20} | {np.sum(eff_v3[mar_start:apr_end]):>15.0f} | {mar_apr_actual - np.sum(eff_v3[mar_start:apr_end]):>8.0f} | {np.mean(eff_v3[mar_start:apr_end]):>19.1f}")
print(f"{'V4 proportional':>20} | {mar_apr_eff:>15.0f} | {mar_apr_actual - mar_apr_eff:>8.0f} | {np.mean(effective[mar_start:apr_end]):>19.1f}")

# List the dates where the two methods differ noticeably
print("\n  Dates where V3 and V4 effective registrations differ by more than 3:")
diff = effective[mar_start:apr_end] - eff_v3[mar_start:apr_end]
for i in range(len(diff)):
    if abs(diff[i]) > 3:
        d = dates[mar_start + i]
        sign = '+' if diff[i] > 0 else ''
        print(f"    {d}: V3={eff_v3[mar_start+i]:.0f}, V4={effective[mar_start+i]:.0f}, diff={sign}{diff[i]:.0f}")

# ==================== Export to Excel ====================
rows = []
for i in range(mar_start, apr_end):
    rows.append({
        'Date': dates[i],
        'Day of week': date_series[i].strftime('%A'),
        'Actual registrations': int(values[i]),
        'Activity-affected': 'Yes' if is_affected[i] else 'No',
        'Activity note': affected_reason.get(dates[i], ''),
        'LOESS trend': round(trend_raw[i], 0),
        'Trend + DoW (proportional)': round(trend_with_dow[i], 0),
        'Effective registrations (for conversion)': round(effective[i], 0),
        'Removed': round(values[i] - effective[i], 0),
    })

output_df = pd.DataFrame(rows)
output_path = '/root/.openclaw/workspace/output/2026_3-4月注册回归分析.xlsx'

with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
    output_df.to_excel(writer, sheet_name='Daily detail', index=False)

    summary_rows = [
        ('Days covered', apr_end - mar_start),
        ('Activity-affected days', mar_apr_affected_count),
        ('', ''),
        ('Actual registrations (total)', f'{mar_apr_actual:.0f}'),
        ('Effective registrations (total, for conversion)', f'{mar_apr_eff:.0f}'),
        ('Removed as activity volume', f'{mar_apr_actual - mar_apr_eff:.0f}'),
        ('Removed share', f'{(mar_apr_actual-mar_apr_eff)/mar_apr_actual*100:.1f}%'),
        ('', ''),
        ('Actual daily average', f'{np.mean(values[mar_start:apr_end]):.1f}'),
        ('Effective daily average', f'{np.mean(effective[mar_start:apr_end]):.1f}'),
        ('', ''),
        ('Jan-Feb clean daily average', f'{np.mean(values[:mar_start][clean_mask[:mar_start]]):.1f}'),
        ('Mar-Apr effective daily average', f'{np.mean(effective[mar_start:apr_end]):.1f}'),
        ('', ''),
        ('Day-of-week effect - Mon', f'{dow_ratio[0]*100:+.1f}%'),
        ('Day-of-week effect - Tue', f'{dow_ratio[1]*100:+.1f}%'),
        ('Day-of-week effect - Wed', f'{dow_ratio[2]*100:+.1f}%'),
        ('Day-of-week effect - Thu', f'{dow_ratio[3]*100:+.1f}%'),
        ('Day-of-week effect - Fri', f'{dow_ratio[4]*100:+.1f}%'),
        ('Day-of-week effect - Sat', f'{dow_ratio[5]*100:+.1f}%'),
        ('Day-of-week effect - Sun', f'{dow_ratio[6]*100:+.1f}%'),
    ]
    pd.DataFrame(summary_rows, columns=['Metric', 'Value']).to_excel(writer, sheet_name='Summary', index=False)

print(f"\n✅ Excel exported: {output_path}")