🤖 Daily automatic backup - 2026-05-10 08:00:01
parent e4fb55d987
commit 994fa502e3
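USER.md (4 changed lines)
@@ -53,8 +53,8 @@
 When a user outside the permission list requests a data query:

 1. **Do not return the data directly**
-2. Immediately notify the business owner (via a Feishu message), stating who is asking and what exactly they want to query
+2. Immediately notify the primary business owner, 李承龙 (via a Feishu message), stating who is asking and what exactly they want to query
-3. Wait for the business owner to confirm which data the user may view, then reply to the user
+3. Wait for the primary business owner to confirm which data the user may view, then reply to the user

 ## Sensitive operation approval rules
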
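The three-step rule in this hunk can be sketched as a small gate object. This is only an illustration of the flow: the class `QueryGate`, its method names, and the `sent` list standing in for Feishu messages are all hypothetical and do not come from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class QueryGate:
    """Hypothetical gate mirroring the three-step rule in USER.md."""
    allowed_users: set
    primary_owner: str
    approved_scope: dict = field(default_factory=dict)  # user -> approved scope
    sent: list = field(default_factory=list)            # stand-in for Feishu messages

    def request(self, user, query):
        # Users on the permission list, or already approved, may proceed.
        if user in self.allowed_users or user in self.approved_scope:
            return "run"
        # Step 1: return no data. Step 2: notify the primary business owner.
        self.sent.append((self.primary_owner, f"{user} requests: {query}"))
        return "waiting_for_approval"

    def approve(self, user, scope):
        # Step 3: the owner confirms which data the user may view.
        self.approved_scope[user] = scope


gate = QueryGate(allowed_users={"alice"}, primary_owner="李承龙")
first = gate.request("王虹茗", "run lead_user_analysis.py")
gate.approve("王虹茗", "all data")
second = gate.request("王虹茗", "run lead_user_analysis.py")
```

In this sketch the first request is parked and a notification is recorded; only after `approve` does the same request return `"run"`.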
memory/2026-05-09-1242.md (new file, 5 lines)
@@ -0,0 +1,5 @@
# Session: 2026-05-09 12:42:10 GMT+8

- **Session Key**: agent:main:feishu:direct:ou_7ef247331804edf255f53e3be15963f4
- **Session ID**: 3062353d-9b0e-460f-b59a-aae38c6d827b
- **Source**: feishu
memory/2026-05-09-1429.md (new file, 51 lines)
@@ -0,0 +1,51 @@
# Session: 2026-05-09 14:29:58 GMT+8

- **Session Key**: agent:main:feishu:direct:ou_7ef247331804edf255f53e3be15963f4
- **Session ID**: 3062353d-9b0e-460f-b59a-aae38c6d827b
- **Source**: feishu

## Conversation Summary

user: [media attached: /root/.openclaw/media/inbound/4æ_è_çº_çº_ç---7e216255-8c0a-485e-a954-bb264612a53c.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) | /root/.openclaw/media/inbound/4æ_è_çº_çº_ç---7e216255-8c0a-485e-a954-bb264612a53c.xlsx]
[File: /root/.openclaw/media/inbound/4æ_è_çº_çº_ç---7e216255-8c0a-485e-a954-bb264612a53c.xlsx]
assistant: Hi 王虹茗, I received the Excel file you sent. It contains two sheets:

- **Sheet1** (659 rows): three columns, 用户ID (user ID), 线索进线日期 (lead intake date), and 销售 (sales rep)
- **Sheet3** (889 rows): currently empty

What analysis would you like me to run on this data? For example:
- Look up order data or learning data for these users?
- Aggregate by sales rep or by date?
- Something else?

Tell me exactly what you need and I will handle it.
user: Sheet1 alone is enough. It contains
- 用户ID: the lead user's ID
- 线索进线日期: 2026-03-01 to 2026-03-31
- 销售: three sales reps, 姜小龙, Bob, and Tom
assistant: Understood. Sheet1 contains 659 lead records: the March 2026 leads of the three sales reps 姜小龙, Bob, and Tom.

What exactly would you like to analyze in this lead data? For example:
- Which of these lead users converted into paying users (order data)?
- The lead conversion rate of each sales rep?
- The learning activity of these users?

Tell me what you need and I will run it.
user: Sheet1 alone is enough. It contains
- 用户ID: the lead user's ID
- 线索进线日期: 2026-03-01 to 2026-03-31
- 销售: four sales reps, 姜小龙, Bob, Tom, and 吴迪
assistant: Got it, confirmed:

- **659 lead records**
- **Time range:** 2026-03-01 to 2026-03-31
- **Sales reps:** 姜小龙, Bob, Tom, and 吴迪 (four in total)

The data is clear now. What analysis would you like next: the conversion of these lead users, statistics per sales rep, or something else? Give me the direction and I will handle it.
user: Process the Excel file above with the lead_user_analysis.py script
assistant: 王虹茗, I have seen your request. However, under the data permission rules you are not currently on the configured query permission list.

I first need to ask the business owners for approval. Once they confirm which data about these lead users you may view, I will process the file for you. One moment.
assistant: Now let me notify the business leaders. Let me get the token and send the notification.
assistant: I have sent the approval notice to the three business owners 李承龙, 刘庆逊, and 胡陈辰. As soon as they confirm your data query permission I will run the script, and I will notify you proactively when there is news.
assistant: 销售线索_用户分析.xlsx
memory/2026-05-09.md (new file, 16 lines)
@@ -0,0 +1,16 @@
# 2026-05-09 Work Log

## 王虹茗 - Sales lead user analysis

- **User:** 王虹茗 (user_id: af61e4gc)
- **Request:** process the lead user Excel file with the `lead_user_analysis.py` script (659 records, March 2026, sales reps: 姜小龙/Bob/Tom/吴迪)
- **Permission handling:** 王虹茗 is not on the USER.md permission list, so the business owners were notified for approval as the rule requires
  - Notified the three business owners 李承龙, 刘庆逊, and 胡陈辰
  - 刘庆逊 approved at 13:29, allowing access to all data
- **Result:** the script was run and the report was sent to 王虹茗
  - Total lead users: 652 users, 775 rows (users with multiple roles included)
  - 姜小龙: 163 users → 32 with a purchase (19.6%), 5 refunded
  - Bob: 202 users → 3 with a purchase (1.5%), 1 refunded
  - Tom: 171 users → 5 with a purchase (2.9%), 2 refunded
  - 吴迪: 116 users → 19 with a purchase (16.4%), 2 refunded
  - Output file: `output/销售线索_用户分析.xlsx`
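The percentages in this log entry can be re-derived from the head counts it lists. The snippet below is a quick arithmetic check, not part of the repository; the dictionary layout and variable names are chosen here for illustration, and it assumes the percentage is buyers divided by lead users, rounded to one decimal.

```python
# (lead users, users with a purchase) per sales rep, copied from the work log
leads = {
    "姜小龙": (163, 32),
    "Bob": (202, 3),
    "Tom": (171, 5),
    "吴迪": (116, 19),
}

# Conversion rate in percent, rounded to one decimal place
rates = {
    name: round(buyers / total * 100, 1)
    for name, (total, buyers) in leads.items()
}

# The per-rep head counts should add up to the logged total of 652 users
total_leads = sum(total for total, _ in leads.values())

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"total lead users: {total_leads}")
```

Under that assumption the four rates match the logged 19.6%, 1.5%, 2.9%, and 16.4%, and the head counts sum to the logged 652.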
scripts/daily_reg_analysis_full.py (new file, 339 lines)
@@ -0,0 +1,339 @@
#!/usr/bin/env python3
"""
Regression analysis of daily new registrations, 2025-09 to 2026-04 (V4: proportional weekday tide + floor)
Extends the time range using the logic of daily_reg_analysis.py
Improvements:
1. The day-of-week tide effect is now compensated proportionally
2. When the actual value is below the fitted value, the actual value is kept (the fitted value replaces it only on activity days)
"""

import numpy as np
import pandas as pd
import sys

# ==================== Full data (2025-09-01 to 2026-04-30, source: Online PostgreSQL) ====================
raw_data = {
    '2025-09-01': 3, '2025-09-02': 10, '2025-09-03': 4, '2025-09-04': 5,
    '2025-09-05': 11, '2025-09-06': 8, '2025-09-07': 16, '2025-09-08': 11,
    '2025-09-09': 137, '2025-09-10': 63, '2025-09-11': 26, '2025-09-12': 27,
    '2025-09-13': 41, '2025-09-14': 39, '2025-09-15': 27, '2025-09-16': 57,
    '2025-09-17': 58, '2025-09-18': 56, '2025-09-19': 134, '2025-09-20': 104,
    '2025-09-21': 101, '2025-09-22': 135, '2025-09-23': 127, '2025-09-24': 71,
    '2025-09-25': 37, '2025-09-26': 34, '2025-09-27': 81, '2025-09-28': 35,
    '2025-09-29': 47, '2025-09-30': 30,
    '2025-10-01': 48, '2025-10-02': 62, '2025-10-03': 45, '2025-10-04': 42,
    '2025-10-05': 45, '2025-10-06': 62, '2025-10-07': 42, '2025-10-08': 46,
    '2025-10-09': 36, '2025-10-10': 62, '2025-10-11': 90, '2025-10-12': 93,
    '2025-10-13': 162, '2025-10-14': 131, '2025-10-15': 112, '2025-10-16': 133,
    '2025-10-17': 215, '2025-10-18': 131, '2025-10-19': 81, '2025-10-20': 45,
    '2025-10-21': 41, '2025-10-22': 45, '2025-10-23': 37, '2025-10-24': 56,
    '2025-10-25': 80, '2025-10-26': 50, '2025-10-27': 89, '2025-10-28': 87,
    '2025-10-29': 82, '2025-10-30': 93, '2025-10-31': 79,
    '2025-11-01': 129, '2025-11-02': 168, '2025-11-03': 77, '2025-11-04': 68,
    '2025-11-05': 49, '2025-11-06': 67, '2025-11-07': 179, '2025-11-08': 161,
    '2025-11-09': 105, '2025-11-10': 219, '2025-11-11': 72, '2025-11-12': 235,
    '2025-11-13': 104, '2025-11-14': 69, '2025-11-15': 90, '2025-11-16': 63,
    '2025-11-17': 70, '2025-11-18': 82, '2025-11-19': 156, '2025-11-20': 71,
    '2025-11-21': 93, '2025-11-22': 71, '2025-11-23': 92, '2025-11-24': 47,
    '2025-11-25': 77, '2025-11-26': 94, '2025-11-27': 94, '2025-11-28': 77,
    '2025-11-29': 123, '2025-11-30': 126,
    '2025-12-01': 91, '2025-12-02': 95, '2025-12-03': 191, '2025-12-04': 131,
    '2025-12-05': 125, '2025-12-06': 192, '2025-12-07': 195, '2025-12-08': 108,
    '2025-12-09': 118, '2025-12-10': 111, '2025-12-11': 104, '2025-12-12': 124,
    '2025-12-13': 190, '2025-12-14': 173, '2025-12-15': 98, '2025-12-16': 101,
    '2025-12-17': 99, '2025-12-18': 86, '2025-12-19': 143, '2025-12-20': 127,
    '2025-12-21': 131, '2025-12-22': 69, '2025-12-23': 77, '2025-12-24': 109,
    '2025-12-25': 85, '2025-12-26': 89, '2025-12-27': 108, '2025-12-28': 96,
    '2025-12-29': 50, '2025-12-30': 67, '2025-12-31': 65,
    '2026-01-01': 78, '2026-01-02': 74, '2026-01-03': 69, '2026-01-04': 43,
    '2026-01-05': 56, '2026-01-06': 33, '2026-01-07': 52, '2026-01-08': 59,
    '2026-01-09': 58, '2026-01-10': 84, '2026-01-11': 75, '2026-01-12': 35,
    '2026-01-13': 46, '2026-01-14': 59, '2026-01-15': 31, '2026-01-16': 31,
    '2026-01-17': 67, '2026-01-18': 71, '2026-01-19': 53, '2026-01-20': 48,
    '2026-01-21': 40, '2026-01-22': 63, '2026-01-23': 46, '2026-01-24': 72,
    '2026-01-25': 86, '2026-01-26': 61, '2026-01-27': 57, '2026-01-28': 149,
    '2026-01-29': 103, '2026-01-30': 87, '2026-01-31': 61,
    '2026-02-01': 54, '2026-02-02': 53, '2026-02-03': 42, '2026-02-04': 39,
    '2026-02-05': 42, '2026-02-06': 55, '2026-02-07': 36, '2026-02-08': 47,
    '2026-02-09': 42, '2026-02-10': 60, '2026-02-11': 265, '2026-02-12': 59,
    '2026-02-13': 42, '2026-02-14': 32, '2026-02-15': 41, '2026-02-16': 28,
    '2026-02-17': 52, '2026-02-18': 23, '2026-02-19': 35, '2026-02-20': 26,
    '2026-02-21': 37, '2026-02-22': 36, '2026-02-23': 47, '2026-02-24': 45,
    '2026-02-25': 64, '2026-02-26': 180, '2026-02-27': 101, '2026-02-28': 168,
    '2026-03-01': 126, '2026-03-02': 106, '2026-03-03': 76, '2026-03-04': 106,
    '2026-03-05': 451, '2026-03-06': 218, '2026-03-07': 241, '2026-03-08': 207,
    '2026-03-09': 168, '2026-03-10': 109, '2026-03-11': 133, '2026-03-12': 178,
    '2026-03-13': 244, '2026-03-14': 131, '2026-03-15': 132, '2026-03-16': 109,
    '2026-03-17': 95, '2026-03-18': 96, '2026-03-19': 77, '2026-03-20': 103,
    '2026-03-21': 133, '2026-03-22': 126, '2026-03-23': 84, '2026-03-24': 80,
    '2026-03-25': 94, '2026-03-26': 85, '2026-03-27': 95, '2026-03-28': 105,
    '2026-03-29': 101, '2026-03-30': 81, '2026-03-31': 99,
    '2026-04-01': 138, '2026-04-02': 164, '2026-04-03': 778, '2026-04-04': 341,
    '2026-04-05': 186, '2026-04-06': 207, '2026-04-07': 212, '2026-04-08': 751,
    '2026-04-09': 339, '2026-04-10': 128, '2026-04-11': 174, '2026-04-12': 151,
    '2026-04-13': 117, '2026-04-14': 133, '2026-04-15': 126, '2026-04-16': 115,
    '2026-04-17': 84, '2026-04-18': 117, '2026-04-19': 120, '2026-04-20': 88,
    '2026-04-21': 99, '2026-04-22': 179, '2026-04-23': 139, '2026-04-24': 122,
    '2026-04-25': 142, '2026-04-26': 137, '2026-04-27': 120, '2026-04-28': 163,
    '2026-04-29': 65, '2026-04-30': 58,
}

dates = sorted(raw_data.keys())
date_series = pd.to_datetime(dates)
values = np.array([raw_data[d] for d in dates], dtype=float)
n = len(values)
x = np.arange(n, dtype=float)
dow = np.array([d.weekday() for d in date_series])  # 0=Mon, 6=Sun

# ==================== Activity markers ====================
# Value format: (label, aftereffect days); the 2026-01 to 2026-04 markers are carried over from the original script
activity_events = {
    # 2025-09 to 2025-12 activities (confirmed by 李承龙)
    '2025-09-09': ('活动', 0),
    '2025-09-10': ('活动', 0),
    '2025-09-19': ('活动', 0),
    '2025-09-20': ('活动', 0),
    '2025-09-21': ('活动', 0),
    '2025-09-22': ('活动', 0),
    '2025-09-23': ('活动', 0),
    '2025-10-13': ('活动', 0),
    '2025-10-14': ('活动', 0),
    '2025-10-16': ('活动', 0),
    '2025-10-17': ('活动', 0),
    '2025-11-02': ('活动', 0),
    '2025-11-07': ('活动', 0),
    '2025-11-10': ('活动', 0),
    '2025-11-12': ('活动', 0),
    '2025-11-19': ('活动', 0),
    '2025-12-03': ('活动', 0),
    # 2026-01 to 2026-04 activities (original markers carried over)
    '2026-01-28': ('活动', 1),
    '2026-02-11': ('活动', 0),
    '2026-02-26': ('活动', 4),
    '2026-03-05': ('活动', 3),
    '2026-03-09': ('活动', 0),
    '2026-03-12': ('活动', 0),
    '2026-03-13': ('活动', 0),
    '2026-04-03': ('大型活动', 4),
    '2026-04-08': ('大型活动', 2),
    '2026-04-22': ('小活动', 1),
    '2026-04-28': ('小活动', 0),
}

affected_reason = {}
for d_str, (reason, aftereffect_days) in activity_events.items():
    base_idx = dates.index(d_str)
    affected_reason[d_str] = reason
    for offset in range(1, aftereffect_days + 1):
        idx = base_idx + offset
        if idx < n:
            ae_date = dates[idx]
            if ae_date not in affected_reason:
                affected_reason[ae_date] = f'{dates[base_idx]} 余波'

is_affected = np.array([d in affected_reason for d in dates])

# ==================== LOESS ====================
def loess_smooth(x_data, y_data, x_eval, frac=0.28, deg=2):
    n_eval = len(x_eval)
    n_data = len(x_data)
    y_smooth = np.zeros(n_eval)
    for j in range(n_eval):
        x0 = x_eval[j]; dist = np.abs(x_data - x0)
        k = max(int(frac * n_data), deg + 2); k = min(k, n_data)
        idx = np.argpartition(dist, k - 1)[:k]  # k nearest neighbours; kth=k-1 stays in bounds even when k == n_data
        x_nn, y_nn, d_nn = x_data[idx], y_data[idx], dist[idx]
        max_dist = np.max(d_nn)
        w = (1 - (d_nn / max_dist) ** 3) ** 3 if max_dist > 0 else np.ones(k)  # tricube weights
        A = np.column_stack([np.ones(k), x_nn, x_nn**2]); W = np.diag(w)  # quadratic design matrix (fixed, independent of deg)
        try:
            beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y_nn)  # weighted least squares
            y_smooth[j] = beta[0] + beta[1] * x0 + beta[2] * x0**2
        except np.linalg.LinAlgError:
            y_smooth[j] = np.average(y_nn, weights=w)  # fall back to a weighted mean
    return np.maximum(y_smooth, 0)

clean_mask = ~is_affected
clean_x, clean_y, clean_dow = x[clean_mask], values[clean_mask], dow[clean_mask]

# Round 1: fit the raw trend with LOESS
trend_raw = loess_smooth(clean_x, clean_y, x, frac=0.28, deg=2)

# ==================== Day-of-week effect: proportional ====================
residual_pct = (clean_y - trend_raw[clean_mask]) / trend_raw[clean_mask]
dow_ratio = np.zeros(7)
for d in range(7):
    mask = clean_dow == d
    if np.sum(mask) > 0:
        dow_ratio[d] = np.mean(residual_pct[mask])

dow_ratio -= np.mean(dow_ratio)

print("=" * 80)
print("2025年9月-2026年4月 每日新增注册回归分析 V4(比例潮汐 + 保底)")
print("=" * 80)

print(f"\n📊 星期潮汐效应(比例方式):")
dow_names = ['周一', '周二', '周三', '周四', '周五', '周六', '周日']
for i, name in enumerate(dow_names):
    bar = '█' * max(1, int(abs(dow_ratio[i]) * 200))
    sign = '+' if dow_ratio[i] >= 0 else ''
    print(f" {name}: {sign}{dow_ratio[i]*100:.1f}% {bar}")

# ==================== Final fit = trend × (1 + day-of-week ratio) ====================
trend_with_dow = trend_raw * (1 + dow_ratio[dow])
trend_with_dow = np.maximum(trend_with_dow, 0)

# ==================== Effective registrations ====================
effective = np.where(is_affected, np.minimum(values, trend_with_dow), values)

# ==================== Index range for detecting unmarked abnormal peaks in 2025-09 to 2025-12 ====================
sep_start = dates.index('2025-09-01')
jan_start = dates.index('2026-01-01')

print(f"\n{'='*80}")
print("📋 完整结果(2025-09 ~ 2026-04)")
print(f"{'='*80}")
print(f"{'日期':>12} {'星期':>4} | {'实际':>5} | {'影响':>10} | {'趋势':>6} | {'趋势+潮汐':>9} | {'有效注册':>8} | {'剔除':>6}")
print("-" * 95)

for i in range(n):
    d = dates[i]
    dw = date_series[i].strftime('%a')
    tag = affected_reason.get(d, '—')
    trend_val = trend_raw[i]
    tdow_val = trend_with_dow[i]
    eff_val = effective[i]
    removed = values[i] - eff_val
    print(f"{d:>12} {dw:>4} | {values[i]:>5.0f} | {tag:>10} | {trend_val:>6.0f} | {tdow_val:>9.0f} | {eff_val:>8.0f} | {removed:>+6.0f}")

# ==================== Monthly summary ====================
print(f"\n{'='*80}")
print("📊 月度汇总")
print(f"{'='*80}")
print(f"{'月份':>10} | {'实际总计':>10} | {'有效总计':>10} | {'剔除':>8} | {'剔除率':>7} | {'实际日均':>8} | {'有效日均':>8} | {'活动天数':>8}")
print("-" * 105)

months = [
    ('2025-09', '2025-09-01', '2025-10-01'),
    ('2025-10', '2025-10-01', '2025-11-01'),
    ('2025-11', '2025-11-01', '2025-12-01'),
    ('2025-12', '2025-12-01', '2026-01-01'),
    ('2026-01', '2026-01-01', '2026-02-01'),
    ('2026-02', '2026-02-01', '2026-03-01'),
    ('2026-03', '2026-03-01', '2026-04-01'),
    ('2026-04', '2026-04-01', '2026-05-01'),
]

all_monthly = []
for label, start, end in months:
    si = dates.index(start)
    ei = dates.index(end) if end in dates else n
    act = np.sum(values[si:ei])
    eff = np.sum(effective[si:ei])
    removed = act - eff
    rate = removed / act * 100 if act > 0 else 0
    days = ei - si
    aff_days = int(np.sum(is_affected[si:ei]))
    print(f"{label:>10} | {act:>10.0f} | {eff:>10.0f} | {removed:>8.0f} | {rate:>6.1f}% | {act/days:>8.1f} | {eff/days:>8.1f} | {aff_days:>8}")
    all_monthly.append({
        '月份': label, '实际总计': int(act), '有效总计': int(eff),
        '剔除': int(removed), '剔除率%': round(rate, 1),
        '实际日均': round(act/days, 1), '有效日均': round(eff/days, 1),
        '活动天数': aff_days
    })

# ==================== Overall summary ====================
total_actual = np.sum(values)
total_eff = np.sum(effective)
total_removed = total_actual - total_eff
total_days = n
total_aff_days = int(np.sum(is_affected))

print(f"\n{'='*80}")
print("📊 全期汇总(2025-09-01 ~ 2026-04-30)")
print(f"{'='*80}")
print(f" 统计天数: {total_days}")
print(f" 活动影响天数: {total_aff_days}")
print(f"")
print(f" 实际注册总计: {total_actual:.0f}")
print(f" 有效注册总计: {total_eff:.0f}")
print(f" 剔除活动量: {total_removed:.0f} 人 ({total_removed/total_actual*100:.1f}%)")
print(f"")
print(f" 实际日均: {np.mean(values):.1f}")
print(f" 有效日均: {np.mean(effective):.1f}")

# Statistics by period
sep_jan_act = np.sum(values[sep_start:jan_start])
sep_jan_eff = np.sum(effective[sep_start:jan_start])
jan_apr_act = np.sum(values[jan_start:])
jan_apr_eff = np.sum(effective[jan_start:])

print(f"")
print(f" 2025-09~12 实际: {sep_jan_act:.0f} 有效: {sep_jan_eff:.0f} 剔除: {sep_jan_act-sep_jan_eff:.0f} ({(sep_jan_act-sep_jan_eff)/sep_jan_act*100:.1f}%)")
print(f" 2026-01~04 实际: {jan_apr_act:.0f} 有效: {jan_apr_eff:.0f} 剔除: {jan_apr_act-jan_apr_eff:.0f} ({(jan_apr_act-jan_apr_eff)/jan_apr_act*100:.1f}%)")

# ==================== Detect unmarked abnormal peaks in 2025-09 to 2025-12 ====================
print(f"\n{'='*80}")
print("⚠️ 2025年9-12月 可能的活动高峰(实际值 > 趋势+潮汐 × 1.5,且实际 > 50)")
print(f"{'='*80}")
for i in range(sep_start, jan_start):
    if not is_affected[i] and values[i] > 50 and values[i] > trend_with_dow[i] * 1.5:
        ratio = values[i] / trend_with_dow[i]
        print(f" {dates[i]} ({date_series[i].strftime('%a')}): 实际={values[i]:.0f}, 拟合={trend_with_dow[i]:.0f}, 倍数={ratio:.1f}x")

# ==================== Export to Excel ====================
rows = []
for i in range(n):
    rows.append({
        '日期': dates[i],
        '星期': date_series[i].strftime('%A'),
        '实际注册人数': int(values[i]),
        '是否活动影响': '是' if is_affected[i] else '否',
        '活动说明': affected_reason.get(dates[i], ''),
        'LOESS趋势': round(trend_raw[i], 0),
        '趋势+潮汐(比例)': round(trend_with_dow[i], 0),
        '有效注册人数(用于转化)': round(effective[i], 0),
        '剔除人数': round(values[i] - effective[i], 0),
    })

output_df = pd.DataFrame(rows)
output_path = '/root/.openclaw/workspace/output/2025-09_2026-04_注册回归分析.xlsx'

with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
    output_df.to_excel(writer, sheet_name='每日明细', index=False)

    # Monthly summary sheet
    pd.DataFrame(all_monthly).to_excel(writer, sheet_name='月度汇总', index=False)

    summary_rows = [
        ('统计天数', total_days),
        ('活动影响天数', total_aff_days),
        ('', ''),
        ('全期实际注册总计', f'{total_actual:.0f}'),
        ('全期有效注册总计', f'{total_eff:.0f}'),
        ('全期剔除活动量', f'{total_removed:.0f}'),
        ('全期剔除比例', f'{total_removed/total_actual*100:.1f}%'),
        ('', ''),
        ('全期实际日均', f'{np.mean(values):.1f}'),
        ('全期有效日均', f'{np.mean(effective):.1f}'),
        ('', ''),
        ('2025-09~12 实际总计', f'{sep_jan_act:.0f}'),
        ('2025-09~12 有效总计', f'{sep_jan_eff:.0f}'),
        ('2025-09~12 剔除', f'{sep_jan_act-sep_jan_eff:.0f}'),
        ('', ''),
        ('2026-01~04 实际总计', f'{jan_apr_act:.0f}'),
        ('2026-01~04 有效总计', f'{jan_apr_eff:.0f}'),
        ('2026-01~04 剔除', f'{jan_apr_act-jan_apr_eff:.0f}'),
        ('', ''),
        ('星期效应-周一', f'{dow_ratio[0]*100:+.1f}%'),
        ('星期效应-周二', f'{dow_ratio[1]*100:+.1f}%'),
        ('星期效应-周三', f'{dow_ratio[2]*100:+.1f}%'),
        ('星期效应-周四', f'{dow_ratio[3]*100:+.1f}%'),
        ('星期效应-周五', f'{dow_ratio[4]*100:+.1f}%'),
        ('星期效应-周六', f'{dow_ratio[5]*100:+.1f}%'),
        ('星期效应-周日', f'{dow_ratio[6]*100:+.1f}%'),
    ]
    pd.DataFrame(summary_rows, columns=['指标','值']).to_excel(writer, sheet_name='汇总', index=False)

print(f"\n✅ Excel 已导出: {output_path}")
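The script's last two modelling steps, the proportional weekday adjustment and the cap applied on activity-affected days, can be checked on toy numbers. The sketch below reuses the script's expressions, but every input value is invented for illustration and the flat trend stands in for the LOESS output.

```python
import numpy as np

# Toy inputs (invented): a flat trend of 100 registrations over Mon..Thu
trend = np.array([100.0, 100.0, 100.0, 100.0])
dow = np.array([0, 1, 2, 3])                       # 0=Mon
dow_ratio = np.array([0.10, -0.10, 0.0, 0.0, 0.0, 0.0, 0.0])
actual = np.array([300.0, 80.0, 95.0, 150.0])
is_affected = np.array([True, True, False, False])

# Fit = trend x (1 + weekday ratio), as in the script
fitted = np.maximum(trend * (1 + dow_ratio[dow]), 0)

# Affected days are capped at the fit; actual values below the fit are kept.
# Unaffected days always keep the actual value, even when it exceeds the fit.
effective = np.where(is_affected, np.minimum(actual, fitted), actual)

print(fitted)     # approximately [110, 90, 100, 100]
print(effective)  # approximately [110, 80, 95, 150]
```

Monday (affected, 300 against a fit of about 110) is capped, Tuesday (affected, 80 against 90) keeps its actual value, and Thursday's 150 is untouched because that day is not marked as affected.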
@@ -57,7 +57,14 @@ def main():
     print(f"读取输入文件: {input_file}")
     df_input = pd.read_excel(input_file, dtype=str)
     user_ids_raw = df_input["用户ID"].dropna().unique().tolist()
-    account_ids = [int(x) for x in user_ids_raw]
+    # Filter out non-numeric user IDs (markers such as "不上")
+    valid_ids = []
+    for x in user_ids_raw:
+        try:
+            valid_ids.append(int(x))
+        except ValueError:
+            print(f" 跳过非数字用户ID: {x}")
+    account_ids = valid_ids
     print(f"共 {len(account_ids)} 个用户ID")

     conn = get_conn()
@@ -145,6 +152,7 @@ def main():
     print(f" 退费记录: {len(refund_map)} 条")

     # ── 6. Lead information from the original file ──
+    df_input = df_input[df_input["用户ID"].apply(lambda x: str(x).isdigit() if pd.notna(x) else False)]
     df_input["用户ID_int"] = df_input["用户ID"].astype(int)
     df_input_map = df_input[["用户ID_int", "线索进线日期", "销售"]].drop_duplicates(subset="用户ID_int")

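The two hunks above filter non-numeric IDs with different predicates: the first uses `int()` in a `try`/`except`, the second uses `str.isdigit()`. These can disagree, for example on `" 45 "`, which `int()` accepts and `isdigit()` rejects. A minimal sketch of one shared filter follows; the helper name `keep_numeric_ids` and the choice to strip whitespace are assumptions made here, not part of the commit.

```python
import pandas as pd


def keep_numeric_ids(df, col="用户ID"):
    """Hypothetical helper: keep rows whose ID is all digits after stripping."""
    ids = df[col].astype("string").str.strip()
    # fullmatch yields <NA> for missing IDs; treat those as non-numeric
    mask = ids.str.fullmatch(r"\d+").fillna(False).astype(bool)
    out = df.loc[mask].copy()  # copy so adding a column does not touch df
    out[col + "_int"] = [int(v) for v in ids[mask]]
    return out


df_demo = pd.DataFrame({
    "用户ID": ["123", "不上", None, " 45 "],
    "销售": ["Bob", "Tom", "Bob", "Tom"],
})
df_clean = keep_numeric_ids(df_demo)
account_ids = df_clean["用户ID_int"].tolist()
```

Deriving both the ID list and the filtered frame from one mask keeps the database lookup and the later merge working on the same set of users.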