🤖 Daily automatic backup - 2026-05-30 08:00:01
parent
3eddd83de9
commit
baf76a77dd
15
MEMORY.md
@@ -241,3 +241,18 @@
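- **Statistics identifier:** all statistics use `userId` as the unique user identifier and ignore `accountId`
- **Reason:** `accountId=0` denotes an anonymous user (for example, at the H5 page-exposure step), whereas `userId` is a device/session-level identifier that exists in both anonymous and logged-in states, so it is better suited to stitching funnel steps together
- **Exception:** tracking events for 「领课加微」 (claim a course, add WeChat) have no `userId` field, so only event-level statistics are possible

## Automatic lesson-progress (行课) query backfill [confirmed by 陈逸鸫, 2026-05-29]

### Triggers
1. **Scheduled patrol:** `scripts/auto_xingke_query.py` runs automatically every 30 minutes between 8:00 and 22:00 each day
2. **Group-chat keyword trigger:** when anyone in group `oc_1b01b6f076d39f17be75ca52e585125f` says **「处理行课」** ("process lesson progress") or **「跑行课」** ("run lesson progress"), the script runs immediately

### Workflow
- **Data source:** Feishu spreadsheet `RFIJsXT8FhGHhctY4RwczcOfnac`; 大麦 (another bot) writes the rows to be queried
- **Sheet 2DOxEI, phone number → ID matching:** read rows marked "待查询ID" (ID pending) → match `account_id` through the masked `bi_vala_app_account.tel` → backfill columns F (user ID) / G (backfill status) / H (updated at) → mark column E as "已处理" (processed)
- **Sheet 55b0eb, lesson-progress query:** read rows marked "待查询" (pending) → look up characters → lesson completion records (8 sharded tables) → total learning time → payment status → activation status → backfill columns D (lesson-progress record) / E (updated at)
- **Script:** `scripts/auto_xingke_query.py`
- **Log:** `/var/log/xiaoxi_xingke_query.log`
- **Matching rule:** phone numbers are masked to the form `first 3 digits + **** + last 4 digits` and matched against `bi_vala_app_account.tel`, restricted to `status=1 AND deleted_at IS NULL`
- **Record format:** `销售:XXX | 用户:ID | 当前:A1体验课-L1-S0-U00-L01 | 最近行课:2026-05-28 | 学习120min | 已付费` (sales rep | user ID | current lesson | latest lesson date | total learning minutes | payment status)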
|
||||
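The record format listed above can be sketched as a small formatter. This is only an illustration, not the code in `scripts/auto_xingke_query.py`: the function name `format_record`, its parameters, and the "未付费" (unpaid) label are assumptions; the other labels are copied from the sample record.

```python
def format_record(sales, user_id, current, latest_date, total_minutes, paid):
    """Build a one-line lesson-progress record in the documented layout."""
    # Labels are copied verbatim from the sample record in the notes.
    # "未付费" (unpaid) is an assumed label; the notes only show "已付费".
    pay_label = "已付费" if paid else "未付费"
    parts = [
        f"销售:{sales}",
        f"用户:{user_id}",
        f"当前:{current}",
        f"最近行课:{latest_date}",
        f"学习{round(total_minutes)}min",
        pay_label,
    ]
    return " | ".join(parts)
```

Calling it with the sample values (`"XXX"`, `"ID"`, `"A1体验课-L1-S0-U00-L01"`, `"2026-05-28"`, `120`, `True`) reproduces the sample line from the notes.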
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"version": 1,
|
||||
"updatedAt": "2026-05-28T08:51:30.745Z",
|
||||
"updatedAt": "2026-05-29T06:34:35.897Z",
|
||||
"entries": {
|
||||
"memory:memory/2026-05-06.md:1:20": {
|
||||
"key": "memory:memory/2026-05-06.md:1:20",
|
||||
@@ -301,18 +301,20 @@
|
||||
"endLine": 29,
|
||||
"source": "memory",
|
||||
"snippet": "# 2026-05-28 工作日志 ## [李承龙] stride_phone_lead 表查询与飞书表格创建 - **数据库:** 在线 PostgreSQL `vala_class` 库(新发现的数据库,之前未记录在 TOOLS.md 中) - **表结构:** `stride_phone_lead`(手机号线索表) - 字段:id, phone_masked, phone_encrypt, account_id, source_l1, source_l2, source_l3, created_at, updated_at - 索引:主键(id), 唯一(phone_encrypt), 普通(account_id, source_l3, updated_at) - phone_encrypt 有唯一约束用于手机号去重 - **权限:** `ai_member` 用户最初对该表无 SELECT 权限,李承龙临时授权后可用 - **vala_class 库中 ai_member 可读表:** account_wx_id, parent_address, user_info, student_info, stride_phone_lead(共5张) ## [李承龙] stride_phone_lead 飞书表格 - **飞书表格:** https://makee-interactive.feishu.cn/sheets/TWeHsCX6JhhsjntBagHcoJcJned - **Sheet 配置:** - 明细数据 (2dBSrS):全量明细 -",
|
||||
"recallCount": 1,
|
||||
"recallCount": 2,
|
||||
"dailyCount": 0,
|
||||
"groundedCount": 0,
|
||||
"totalScore": 1,
|
||||
"totalScore": 2,
|
||||
"maxScore": 1,
|
||||
"firstRecalledAt": "2026-05-28T08:51:30.745Z",
|
||||
"lastRecalledAt": "2026-05-28T08:51:30.745Z",
|
||||
"lastRecalledAt": "2026-05-29T06:34:35.897Z",
|
||||
"queryHashes": [
|
||||
"a2d80701d49d"
|
||||
"a2d80701d49d",
|
||||
"2168d4aeddb5"
|
||||
],
|
||||
"recallDays": [
|
||||
"2026-05-28"
|
||||
"2026-05-28",
|
||||
"2026-05-29"
|
||||
],
|
||||
"conceptTags": [
|
||||
"stride-phone-lead",
|
||||
@@ -332,18 +334,20 @@
|
||||
"endLine": 33,
|
||||
"source": "memory",
|
||||
"snippet": "- 按source_l3去重 (1X5g5z):按 source_l3 维度 account_id 去重计数 ## [李承龙] stride_phone_lead 定时更新 - **定时任务:** 每天 08:00 自动更新飞书表格 - **脚本:** `scripts/update_stride_phone_lead_sheet.py` - **日志:** `/var/log/xiaoxi_stride_phone_lead.log` - **crontab 已添加** ✅ ## 技术备注 - lark-cli 的 `config bind` 在 Linux 上 keychain 损坏,改用飞书 Open API + curl/Python 直接调用 - 飞书 Bot 凭证(小溪):appId=cli_a929ae22e0b8dcc8, secret 见 secrets.env - `vala_class` 数据库需要补充到 TOOLS.md 和 MEMORY.md 中",
|
||||
"recallCount": 1,
|
||||
"recallCount": 2,
|
||||
"dailyCount": 0,
|
||||
"groundedCount": 0,
|
||||
"totalScore": 1,
|
||||
"totalScore": 2,
|
||||
"maxScore": 1,
|
||||
"firstRecalledAt": "2026-05-28T08:51:30.745Z",
|
||||
"lastRecalledAt": "2026-05-28T08:51:30.745Z",
|
||||
"lastRecalledAt": "2026-05-29T06:34:35.897Z",
|
||||
"queryHashes": [
|
||||
"a2d80701d49d"
|
||||
"a2d80701d49d",
|
||||
"2168d4aeddb5"
|
||||
],
|
||||
"recallDays": [
|
||||
"2026-05-28"
|
||||
"2026-05-28",
|
||||
"2026-05-29"
|
||||
],
|
||||
"conceptTags": [
|
||||
"按source-l3去重",
|
||||
|
||||
489
memory/2026-05-29-1222.md
Normal file
@@ -0,0 +1,489 @@
|
||||
# Session: 2026-05-29 12:22:23 GMT+8
|
||||
|
||||
- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
|
||||
- **Session ID**: f8d7444f-54ec-4202-9bd8-c4aaec0f4f50
|
||||
- **Source**: feishu
|
||||
|
||||
## Conversation Summary
|
||||
|
||||
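user: Let's discuss an approach to a data analysis. There is no rush to run numbers; let's get the logic straight before executing. I'll share my thinking first, then you tell me yours.

This analysis started because I want to add pop-ups to my app that prompt the users they reach to buy courses in the app, raising the conversion rate. The questions in front of me are where to show the pop-ups and whom to show them to, and I want data analysis to settle both.

First, some basics about my app. A user downloads the app, activates it, registers an account, and creates a trial course. The trial course is U0 of L1 or L2 and is free. The user then starts learning and later buys the follow-up courses; the purchase may happen when the prologue is finished or before that.

Next, I want to use pop-ups to guide users to add my sales staff on WeChat so that the sales staff can guide them to buy courses. So I need to know which users are better suited to adding a salesperson on WeChat, and where a pop-up is most likely to get users to do so.

I'd like you to go through all the data you know about and give me your analysis approach.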
|
||||
assistant: 好的,我已经遍历了所有已知的数据源。下面是我的完整分析思路。
|
||||
|
||||
---
|
||||
|
||||
## 整体分析框架
|
||||
|
||||
你的问题本质上是一个「精准触达」问题,核心是两个维度:
|
||||
|
||||
- **在哪里弹** → 用户旅程中的最佳触达节点
|
||||
- **给谁弹** → 最有转化潜力的用户分群
|
||||
|
||||
我的思路是:不可能一步到位给出弹窗方案,而是分三层递进分析,每一层用数据缩小不确定性。
|
||||
|
||||
---
|
||||
|
||||
## 第一层:用户旅程全景图(回答「现在是什么样」)
|
||||
|
||||
这一层不做结论,先看清用户从注册到购买的完整面貌。
|
||||
|
||||
### 1.1 用户生命周期阶段定义
|
||||
|
||||
根据你描述的流程,把用户按状态分层:
|
||||
|
||||
```
|
||||
注册 → 创建体验课(L1/L2 U0) → 开始学习U00 → 学完U00 → 购买后续课程
|
||||
```
|
||||
|
||||
每个用户当前处于哪个阶段?每个阶段的「库存」有多少?这是后续分析的底座。
|
||||
|
||||
### 1.2 Stage-by-stage conversion funnel

Compute the conversion rate at each step for the stages above:

| Stage | Users | Conversion rate |
|------|------|--------|
| Registered users (status=1) | ? | 100% |
| Created a trial course (has a U00 course record) | ? | ?% |
| Completed at least L01 | ? | ?% |
| Completed all of U00 (L01-L05) | ? | ?% |
| Continued to U01+ after U00 | ? | ?% |
| Purchased follow-up courses | ? | ?% |
|
||||
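The funnel table above could be filled in with a helper like the following sketch. The stage names and counts are placeholders for illustration only, not results from this analysis, and `funnel_rates` is a hypothetical name.

```python
def funnel_rates(stages):
    """Return (name, count, % of first stage, % of previous stage) per stage."""
    if not stages:
        return []
    rows = []
    first = stages[0][1]
    prev = first
    for name, count in stages:
        overall = count / first if first else 0.0
        step = count / prev if prev else 0.0
        rows.append((name, count, round(overall * 100, 1), round(step * 100, 1)))
        prev = count
    return rows


# Placeholder counts, for illustration only.
example = [("registered", 1000), ("created trial", 800), ("finished L01", 400)]
for row in funnel_rates(example):
    print(row)
```

Reporting both rates side by side helps separate "how many users are left" from "how leaky this particular step is".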
|
||||
这里有一个关键数据之前跑过:**在沉默用户中,L1 约 8 成、L2 约 7 成断在 U00~U01**。这说明 U00 是最大的决策窗口,弹窗时机大概率落在这个区间。
|
||||
|
||||
### 1.3 购买时间线分布
|
||||
|
||||
对所有付费用户的购买时间做一个分布分析:从注册日到首次购买日,相隔多少天?分布曲线长什么样?
|
||||
|
||||
这个数据直接回答「弹窗应该在第几天触发」。
|
||||
|
||||
---
|
||||
|
||||
## 第二层:购买信号挖掘(回答「谁更可能买」)
|
||||
|
||||
这一层找的是:**在用户没有购买之前,有没有行为信号能提前"预判"他会买?**
|
||||
|
||||
### 2.1 购买者 vs 非购买者的学习行为对比
|
||||
|
||||
把用户分成两组——「最终买了的人」和「一直没买的人」,对比他们在 U00 阶段的行为差异:
|
||||
|
||||
| 行为指标 | 购买者均值 | 非购买者均值 |
|
||||
|----------|-----------|-------------|
|
||||
| U00 完成课时数 | ? | ? |
|
||||
| U00 平均每课耗时(分钟) | ? | ? |
|
||||
| 第一天学习课数 | ? | ? |
|
||||
| 组件正确率(从 ES 埋点 subId=4/ex3 取) | ? | ? |
|
||||
| 首次学习→次日是否回访 | ?% | ?% |
|
||||
| 是否完成 U00 全部 5 课 | ?% | ?% |
|
||||
|
||||
**分析目的**:如果购买者在 U00 阶段的行为模式与非购买者有明显差异,那这些行为就可以作为弹窗的「触发条件」。比如「完成 L03 且正确率 > X%」的用户转化率显著更高,就可以在这个节点弹窗。
|
||||
|
||||
### 2.2 购买发生的课程节点分布
|
||||
|
||||
统计「用户是在学到哪一课时购买的」:
|
||||
|
||||
| 购买节点 | 购买人数 | 占比 |
|
||||
|----------|---------|------|
|
||||
| 学 U00 之前就买了 | ? | ?% |
|
||||
| U00 L01 之后 | ? | ?% |
|
||||
| U00 L02 之后 | ? | ?% |
|
||||
| U00 L03 之后 | ? | ?% |
|
||||
| U00 L04 之后 | ? | ?% |
|
||||
| U00 L05 之后 | ? | ?% |
|
||||
| U01 之后 | ? | ?% |
|
||||
| 更后面 | ? | ?% |
|
||||
|
||||
**分析目的**:如果大部分购买集中在某几个课后,那这些课后就是天然的弹窗节点。之前的数据已经暗示 U00 L01 是最大断点(~30% 用户第一课就停),但我们要看的是**买了的人**集中在哪买,不是**停了的人**停在哪——这是两个不同的问题。
|
||||
|
||||
### 2.3 用户分群
|
||||
|
||||
基于学习行为把非购买用户分成三类:
|
||||
|
||||
- **高潜用户**:学完了 U00 全部 5 课、正确率高、学习频率稳定 → 已经有付费意愿,弹窗主要推「加销售微信了解后续课程」
|
||||
- **犹豫用户**:学到 L02-L04 之间、正确率尚可、但学习间隔开始拉长 → 弹窗推「加微信获取学习建议」,用服务促转化
|
||||
- **流失风险用户**:L01 后长期不回访、或注册后迟迟不开始学习 → 弹窗推「加微信领取福利/试听课」,用利益拉回
|
||||
|
||||
---
|
||||
|
||||
## 第三层:加微行为的现有数据验证(回答「加微能不能促转化」)
|
||||
|
||||
这一层是验证弹窗加微这个策略本身是否成立。
|
||||
|
||||
### 3.1 stride_phone_lead 线索分析
|
||||
|
||||
`vala_class.stride_phone_lead` 表记录了用户提交手机号的行为(即用户主动留资)。这是目前最直接的「用户有意向被联系」的信号。
|
||||
|
||||
可以做的分析:
|
||||
- 留资用户的 account_id 能否关联到 `bi_vala_order`?留资后有多少人买了?
|
||||
- 留资时间 vs 购买时间的时间差分布
|
||||
- 不同 `source_l3`(留资渠道)的转化率对比
|
||||
- 留资用户的特征画像(学习进度、注册渠道等)
|
||||
|
||||
**⚠️ 注意**:这个表目前 ai_member 有读权限(你刚授权过),数据量应该不大,可以直接分析。
|
||||
|
||||
### 3.2 account_wx_id 关联
|
||||
|
||||
`vala_class.account_wx_id` 表记录了账号与微信的绑定关系。如果这个表有数据,可以看看:
|
||||
- 哪些用户绑定了微信?和购买行为有关联吗?
|
||||
- 绑定微信的用户转化率是否更高?
|
||||
|
||||
---
|
||||
|
||||
## 综合分析产出
|
||||
|
||||
跑完上述三层分析后,可以产出一个可执行的弹窗策略矩阵:
|
||||
|
||||
| 触发位置 | 触发条件 | 目标用户 | 弹窗内容 |
|
||||
|----------|---------|---------|---------|
|
||||
| U00 L03 完成后 | 正确率 > 70% + 连续学习 | 高潜用户 | "学得不错!加老师微信了解后续课程" |
|
||||
| U00 L01 完成后 24h 未回访 | 首课完成但未继续 | 流失风险 | "领取免费学习规划,老师 1v1 指导" |
|
||||
| U00 L05 完成后 | 完成全部序章 | 高潜用户 | "序章学完了!后续课程加微信享专属优惠" |
|
||||
| 注册后 7 天未学习 | 有体验课但从未开始 | 召回用户 | "你的免费课程已就绪,加微信激活" |
|
||||
|
||||
---
|
||||
|
||||
## 我看到的局限和补充点
|
||||
|
||||
1. **「加微信」行为目前没有直接的埋点追踪。** `stride_phone_lead` 是留手机号,和「在 APP 内弹窗→加微信」是不同路径;`growth_activity_behavior` 的「领课加微」是纯匿名埋点(没有 userId),无法关联到具体用户。所以我们能做的是**用历史留资数据间接验证加微价值**,然后通过弹窗实验来最终验证。
|
||||
|
||||
2. **ES 的 `user_behavior_buried_points`(2700 万条)是富矿但之前挖得不深。** 里面可能有页面访问、按钮点击、弹窗相关的埋点,如果你们后续在 APP 里埋了弹窗曝光/点击事件,这里就是直接的效果衡量数据源。
|
||||
|
||||
3. **弹窗策略最终需要 A/B 实验验证。** 数据分析能告诉我们「最可能在哪儿弹、弹给谁」,但弹窗的具体文案、样式、频次控制,需要上线后看实际转化率迭代。
|
||||
|
||||
---
|
||||
|
||||
以上就是我的完整分析思路。你觉得这个框架有没有遗漏的维度?或者有没有哪一层你想先深入展开讨论的?我们敲定逻辑之后再动手跑数。
|
||||
user: 我们来更新几个逻辑:
|
||||
|
||||
1. 弹窗分为两类:
|
||||
(a) 我们上面聊到的加销售人员弹窗
|
||||
(b) 直接售卖的弹窗
|
||||
|
||||
我认为在用户旅程中,达到体验最佳节点的时候,应该弹出的是售卖弹窗。那么在用户达到最佳体验之前或者之后,去进行加微弹窗的弹出是更合理的。
|
||||
2. 用户不一定学完U0才会购买课程,他可能在U0的 L1、L2、L3、L4 的过程中就购买了课程。
|
||||
|
||||
另外,只有U0是免费的,U1 不免费,所以用户在购买课程之前是没有办法学习 U1 的
|
||||
3. 第三层那些数据都是无效的:
|
||||
(a) 3.2 的数据是用户购课之后的加 V 行为,所以无效
|
||||
(b) 3.1 是其他渠道的线索,所以也是无效的
|
||||
assistant: 明白了,这三点修正很关键。让我重新梳理——
|
||||
|
||||
**核心变化:**
|
||||
|
||||
1. 现在有两种弹窗,它们不是孤立的,而是围绕用户旅程的"体验峰值"形成一条逻辑链:
|
||||
- **加微弹窗(峰值前)** → 引导用户继续学习,帮他达到峰值
|
||||
- **售卖弹窗(峰值时)** → 在用户最"上头"的时刻直接转化
|
||||
- **加微弹窗(峰值后)** → 峰值没转化的人,交给销售人工跟进
|
||||
|
||||
2. 第三层加微数据全部无效,意味着我们**没有历史数据可以直接验证"加微→购买"这条链路**。这是分析的前提约束,策略效果要靠上线后 A/B 实验来验证。
|
||||
|
||||
3. U1 不免费,U0→U1 是一个硬付费门槛。用户要么在 U0 途中买、要么学完 U0 时买、要么就不买了——不存在「先免费学学 U1」这个选项。
|
||||
|
||||
---
|
||||
|
||||
基于以上,我重新梳理分析框架:
|
||||
|
||||
## 修正后的分析框架
|
||||
|
||||
整个分析围绕一个核心问题展开:
|
||||
|
||||
> **用户旅程中的"体验峰值"在哪里?**
|
||||
|
||||
找到这个峰值,售卖弹窗的位置就确定了。峰值前后的加微弹窗位置也随之确定。
|
||||
|
||||
---
|
||||
|
||||
### 分析一:定位「体验峰值」(回答「售卖弹窗弹在哪」)
|
||||
|
||||
#### 1.1 购买节点的分布
|
||||
|
||||
从 `bi_vala_order` 出发,关联学习记录,看**购买者是在学到哪一课时买的**:
|
||||
|
||||
```
|
||||
购买时机分布(付费用户)
|
||||
├── 开始学习前就买了 → ?人 (?%)
|
||||
├── U00 L01 之后买了 → ?人 (?%)
|
||||
├── U00 L02 之后买了 → ?人 (?%)
|
||||
├── U00 L03 之后买了 → ?人 (?%)
|
||||
├── U00 L04 之后买了 → ?人 (?%)
|
||||
├── U00 L05 之后买了 → ?人 (?%)
|
||||
└── 学完 U00+ 之后才买 → ?人 (?%)(等了一阵才买,说明峰值已过)
|
||||
```
|
||||
|
||||
**判断标准**:哪个课之后购买最集中,哪个就是售卖弹窗的候选位置。
|
||||
|
||||
这里需要分 L1 和 L2 分别统计——两个等级的 U00 难度不同,峰值可能不同。
|
||||
|
||||
#### 1.2 峰值前后购买行为的差异
|
||||
|
||||
在峰值节点之前买的人 vs 之后买的人,学习行为有什么差异?
|
||||
|
||||
| 维度 | 峰值前购买 | 峰值时购买 | 峰值后购买 |
|
||||
|------|-----------|-----------|-----------|
|
||||
| 人数 | ? | ? | ? |
|
||||
| U00 平均正确率 | ? | ? | ? |
|
||||
| 学习频次(课/天) | ? | ? | ? |
|
||||
| 注册→购买天数 | ? | ? | ? |
|
||||
|
||||
**判断标准**:
|
||||
- 如果峰值时购买的人正确率显著更高 → 说明这个节点的确有「体验峰值」效应
|
||||
- 如果峰值前买的人本身就高度活跃 → 不需要加微引导,直接上售卖弹窗就行
|
||||
- 如果峰值后买的人特征是「学习断续后又回来买」→ 加微弹窗的目标用户画像
|
||||
|
||||
---
|
||||
|
||||
### 分析二:峰值前用户分析(回答「加微弹窗——峰值前弹在哪」)
|
||||
|
||||
峰值之前的用户分为两种:
|
||||
|
||||
#### 2.1 正在学习中但还没到峰值
|
||||
|
||||
这些用户在 U00 L01→L02→... 爬坡中。问题是:**有没有一个节点,用户最容易"卡住"?**
|
||||
|
||||
之前我们跑过沉默用户的断点分布:
|
||||
- L01 后沉默 ~30%(第一课就停了)
|
||||
- L02 后沉默 ~20%
|
||||
- L01+L02 合计超 50%
|
||||
|
||||
这说明 L01→L02 之间是最大流失口。如果在 L01 完成后弹出加微弹窗,可能是挽救流失的一个手段。
|
||||
|
||||
但要注意区分:**断点是"沉默用户的断点"**,我们需要看的是「最终买了的人在这个阶段有没有卡过,以及那些没买的人卡在哪」。
|
||||
|
||||
#### 2.2 注册后但迟迟没开始学习
|
||||
|
||||
这批用户有体验课但一次都没学。弹窗策略可能是「唤醒 + 加微引导」。
|
||||
|
||||
---
|
||||
|
||||
### 分析三:峰值后用户分析(回答「加微弹窗——峰值后弹在哪」)
|
||||
|
||||
这是最关键的加微弹窗场景:用户已经体验完了最佳节点,但还没买。
|
||||
|
||||
#### 3.1 峰值后但还在 U00 内
|
||||
|
||||
比如峰值在 L03,那 L04 开始就是「峰值后」。这批用户:
|
||||
- 体验了峰值但没被售卖弹窗转化(或者当时还没弹窗)
|
||||
- 还在继续学习,说明兴趣还在
|
||||
- **这是加微弹窗的最高优先级目标**
|
||||
|
||||
#### 3.2 学完 U00 全部 5 课但没买
|
||||
|
||||
这批用户的现状:
|
||||
- 体验课全部学完,对产品有了完整认知
|
||||
- 卡在 U00→U1 的付费墙前
|
||||
- 兴趣已经被验证(5 课都学完了)
|
||||
|
||||
**这批人是加微弹窗的绝对核心目标**——他们已经证明了学习意愿,只差临门一脚,销售介入效率最高。
|
||||
|
||||
#### 3.3 峰值后但学习频率下降了
|
||||
|
||||
比如峰值在 L03,用户到 L04/L05 时学习间隔从 1 天拉长到 5 天。这个信号意味着用户兴趣在衰减,加微弹窗要趁早触发。
|
||||
|
||||
---
|
||||
|
||||
### 分析四:弹窗策略的整体产出
|
||||
|
||||
分析完成后,输出一张策略矩阵:
|
||||
|
||||
| 旅程阶段 | 触发位置 | 弹窗类型 | 目标用户 | 核心信息 |
|
||||
|---------|---------|---------|---------|---------|
|
||||
| 峰值前 | L01 完成后 + 24h 未继续 | 加微 | 首课即停的流失风险用户 | 「老师帮你规划学习」 |
|
||||
| 峰值前 | L02 完成后 | 加微 | 还在爬坡但节奏慢的用户 | 「你的学习表现不错,加老师微信获取指导」 |
|
||||
| **峰值** | **L0X 完成后** | **售卖** | **所有到达此节点的非购买用户** | **「继续学习,立即购买」** |
|
||||
| 峰值后 | 峰值后下一课完成 | 加微 | 错过售卖弹窗的高意向用户 | 「学得不错!加微信了解完整课程」 |
|
||||
| 峰值后 | U00 全部 5 课完成后 | 加微 | 完课但未购买 | 「序章学完啦,加微信享专属优惠」 |
|
||||
|
||||
---
|
||||
|
||||
### 分析上的局限和务实建议
|
||||
|
||||
1. **没有加微行为数据**意味着我们无法用历史数据做「加微→购买」的归因。分析能做到的是:
|
||||
- 找到最佳售卖弹窗位置(通过购买节点分布)
|
||||
- 找到峰值前后的加微弹窗位置(通过流失/断点分布)
|
||||
- **但无法预估加微弹窗的实际转化率**,这个需要上线后 A/B 实验验证
|
||||
|
||||
2. 一个务实的策略建议:**先上线售卖弹窗**(因为有购买行为支撑,位置更确定),跑通数据后再加入加微弹窗做对比实验。
|
||||
|
||||
---
|
||||
|
||||
你觉得这个修正后的框架逻辑是否成立?特别是"体验峰值"这个核心概念——我们通过购买节点分布来确定它,这个思路你认可吗?
|
||||
user: 思路没问题,跑数吧
|
||||
assistant: 好的,开始跑数。先获取基础信息,再逐层推进。
|
||||
assistant: U00 chapter_id 确认了。现在写一个完整的分析脚本。
|
||||
assistant: 前三个分析已经跑出来了,第四个有列名歧义,修复一下。
|
||||
assistant: 数据全部跑出来了,以下是完整分析结果:
|
||||
|
||||
---
|
||||
|
||||
## 📊 弹窗策略数据分析
|
||||
|
||||
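### Analysis 1: Purchase-node distribution (where buyers bought)

| Purchase node | Orders | Users | Share |
|---------|--------|--------|------|
| Before any learning | 1,586 | 1,554 | 31.5% |
| After L01 | 884 | 879 | 17.8% |
| After L02 | 536 | 535 | 10.8% |
| After L03 | 300 | 297 | 6.0% |
| After L04 | 182 | 181 | 3.7% |
| **After L05** | **1,431** | **1,359** | **27.5%** |

**Key finding:** purchases are "bimodal": 31.5% of buyers purchase without learning anything, and 27.5% purchase only after finishing all of U00. **Users who buy partway through, after L01 to L04, account for only about 38% in total**, and no single node shows a sharp concentration of purchases.

By product: 68% of L1 users buy after L05, while 40% of L2 users buy before learning, so L2 purchase intent is clearly stronger.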
|
||||
|
||||
---
|
||||
|
||||
### 分析二:峰值后用户池
|
||||
|
||||
**3,137 人学完 U00 全部 5 课但未付费**,付费率 49.1%。这是加微弹窗最大的存量目标池。
|
||||
|
||||
---
|
||||
|
||||
### 分析三:各断点转化率(到达该课的用户中多少人买了)
|
||||
|
||||
| 进度 | L1 转化率 | L2 转化率 |
|
||||
|------|----------|----------|
|
||||
| 到达 L01 | 31.6% | 26.7% |
|
||||
| 到达 L02 | 39.2% | 34.4% |
|
||||
| 到达 L03 | 45.8% | 39.2% |
|
||||
| 到达 L04 | 50.2% | 43.7% |
|
||||
| **到达 L05** | **54.8%** | **47.7%** |
|
||||
|
||||
转化率随学习进度**线性递增**,没有跳跃式拐点。这说明学习行为本身就是最强的筛选机制——学得越多,越可能买。
|
||||
|
||||
---
|
||||
|
||||
### 分析四:各节点沉默用户规模
|
||||
|
||||
| 断点 | L1 人数 | L2 人数 |
|
||||
|------|---------|---------|
|
||||
| L01 | 1,893 | 3,021 |
|
||||
| L02 | 1,089 | 1,222 |
|
||||
| L03 | 554 | 784 |
|
||||
| L04 | 430 | 507 |
|
||||
|
||||
L01+L02 合计:L1 约 3,000 人、L2 约 4,200 人。这是峰值前加微的最大目标池。
|
||||
|
||||
---
|
||||
|
||||
## 基于数据的策略建议
|
||||
|
||||
### 售卖弹窗
|
||||
|
||||
数据不支持存在单一"体验峰值"——转化率是平滑递增的。但有两个自然决策窗口:
|
||||
|
||||
- **L03-L04 之间**:转化率已达 40-50%,用户已投入足够学习成本,决策成本降低
|
||||
- **L05 完成后**:转化率最高(48-55%),且是 U00→U1 付费墙前的最后节点
|
||||
|
||||
建议:L04 完成时触发售卖弹窗,L05 完成后再次触发。
|
||||
|
||||
### 加微弹窗
|
||||
|
||||
| 时机 | 目标池规模 | 理由 |
|
||||
|------|-----------|------|
|
||||
| **L01 完成后 + 24h 未回访** | L1: ~1,900 / L2: ~3,000 | 首课即弃,最大流失口 |
|
||||
| **L02 完成后** | L1: ~1,100 / L2: ~1,200 | 第二波流失,仍有挽回空间 |
|
||||
| **L05 完成后 + N 天未购买** | **3,137 人** | 完课未购,加微转化的最高优先级 |
|
||||
|
||||
你看这个结论是否合理?要不要我进一步加细某个维度的分析?
|
||||
user: 我们只关注端内购课的用户,非端内购课的用户剔除掉,不分析。另外我还想看看用户从注册到购课的天数分布。
|
||||
assistant: 明白,端内过滤 + 新增注册→购课天数分布。我把脚本全量更新重跑。
|
||||
assistant: 端内付费只有332人,这个数量级很有意思。修复分段统计的SQL问题继续跑。
|
||||
assistant: 分析零跑出来了。分析一有SQL语法问题,修复一下。
|
||||
assistant: 分析一结果很清楚了——**端内付费用户 89.7% 在 L05 之后购买**。修复分析二的字段问题。
|
||||
assistant: 全部跑完了,以下是修正后的完整结果:
|
||||
|
||||
---
|
||||
|
||||
## 📊 弹窗策略数据分析(仅端内付费用户)
|
||||
|
||||
### Analysis 0: Days from registration to purchase

| Period | Users | Share | Cumulative |
|--------|------|------|------|
| Same day | 20 | 6.1% | 6.1% |
| 1-3 days | 79 | 24.1% | 30.1% |
| 4-7 days | 78 | 23.8% | 53.9% |
| 8-14 days | 54 | 16.5% | 70.3% |
| 15-30 days | 43 | 13.1% | 83.4% |
| More than 30 days | 54 | 16.5% | 100% |

There are **332** in-app paying users in total; the median falls at around day 7, and 70% complete the purchase within 2 weeks.
|
||||
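A distribution table like the one above can be produced by bucketing the day gaps, as in this sketch. The bucket edges follow the table's row labels; assigning day 30 to the "15-30 days" bucket is an assumption, and `bucket_days` is a hypothetical helper rather than the query that was actually run.

```python
from bisect import bisect_left

# Inclusive upper bound of each bucket; anything above the last bound is "30+".
BOUNDS = [0, 3, 7, 14, 30]
LABELS = ["same day", "1-3 days", "4-7 days", "8-14 days", "15-30 days", "30+ days"]


def bucket_days(days_list):
    """Return rows of (label, count, share %, cumulative %) for day gaps."""
    counts = [0] * len(LABELS)
    for d in days_list:
        # bisect_left finds the first bucket whose upper bound is >= d.
        counts[bisect_left(BOUNDS, d)] += 1
    total = sum(counts)
    rows, running = [], 0
    for label, n in zip(LABELS, counts):
        running += n
        share = round(100 * n / total, 1) if total else 0.0
        cum = round(100 * running / total, 1) if total else 0.0
        rows.append((label, n, share, cum))
    return rows
```

Computing the cumulative column from running counts, rather than by summing rounded shares, avoids drift from rounding.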
|
||||
---
|
||||
|
||||
### 分析一:购买节点分布
|
||||
|
||||
| 购买节点 | 用户数 | 占比 |
|
||||
|----------|--------|------|
|
||||
| 学习前购买 | 7 | 2.2% |
|
||||
| L01之后 | 8 | 2.5% |
|
||||
| L02之后 | 9 | 2.8% |
|
||||
| L03之后 | 5 | 1.6% |
|
||||
| L04之后 | 4 | 1.2% |
|
||||
| **L05之后** | **288** | **89.7%** |
|
||||
|
||||
**这是最关键的发现。** 端内付费用户的购买行为高度集中在 L05(学完 U00 全部 5 课)之后,占比近 90%。L01-L04 中途购买的合计只有 26 人(8.1%)。
|
||||
|
||||
按产品拆分也完全一致:L1 的 99/107 人、L2 的 184/199 人在 L05 之后购买。
|
||||
|
||||
---
|
||||
|
||||
### 分析二 & 三:断点用户规模 + 端内付费率
|
||||
|
||||
| 断点 | L1 人数 | L1 付费率 | L2 人数 | L2 付费率 |
|
||||
|------|---------|----------|---------|----------|
|
||||
| L01 | 1,893 | 0.3% | 3,021 | 0.2% |
|
||||
| L02 | 1,089 | 0.6% | 1,222 | 0.3% |
|
||||
| L03 | 554 | 0.5% | 784 | 0.5% |
|
||||
| L04 | 430 | 0.5% | 507 | 0.8% |
|
||||
| L05 | 3,109 | 4.4% | 3,606 | 5.5% |
|
||||
|
||||
学完 U00 全部 5 课的总用户 6,166 人,其中端内付费仅 306 人(5.0%),**未付费 5,860 人(95%)**。
|
||||
|
||||
---
|
||||
|
||||
## 与之前的对比(端内 vs 全量)
|
||||
|
||||
过滤端内后,结论发生了根本性变化:
|
||||
|
||||
| 维度 | 全量数据 | 仅端内 |
|
||||
|------|---------|--------|
|
||||
| 付费用户量 | ~4,900 | **332** |
|
||||
| 学习前购买占比 | 31.5% | **2.2%** |
|
||||
| L05之后购买占比 | 27.5% | **89.7%** |
|
||||
| 学完U00后的付费率 | 49.1% | **5.0%** |
|
||||
|
||||
全量数据中的"学习前购买"大头全是端外渠道(销售/达人/店铺等),端内用户行为模式完全不同——**他们几乎全部是学完 U00 再买**。
|
||||
|
||||
---
|
||||
|
||||
## 修正后的策略结论
|
||||
|
||||
### 售卖弹窗
|
||||
|
||||
**只有一个最佳位置:L05 完成后。** 在 L01-L04 弹售卖弹窗基本没有意义——端内用户不会在中途下单。
|
||||
|
||||
### 加微弹窗
|
||||
|
||||
**核心目标池:5,860 个学完 U00 但未端内付费的用户。**
|
||||
|
||||
加微弹窗应该分两层:
|
||||
1. **L01-L02 之间(峰值前/学习中)**:L1~3000人 + L2~4200人停留在这个阶段。付费率极低(0.2-0.6%),但规模大,加微可以挽留、引导继续学习
|
||||
2. **L05 完成后(峰值后)**:5,860 人学完了但没买,这是最大的加微转化机会。结合注册→购课天数数据,如果在 L05 完成后 7 天内未购买,应该触发加微弹窗
|
||||
|
||||
### 策略矩阵
|
||||
|
||||
| 位置 | 弹窗类型 | 目标人数 | 时机 |
|
||||
|------|---------|---------|------|
|
||||
| L01 完成+24h未继续 | 加微 | L1~1900 / L2~3000 | 首课流失挽回 |
|
||||
| L02 完成 | 加微 | L1~1100 / L2~1200 | 学习激励 |
|
||||
| **L05 完成** | **售卖** | **6,166** | 体验终点,转化窗口 |
|
||||
| L05 完成后+3天未购买 | 加微 | ~5,860 | 完课未购,销售跟进 |
|
||||
|
||||
这个结论和之前的分析差异很大——你的端内过滤条件改变了全部分析结果。你看这个方向对吗?
|
||||
105
memory/2026-05-29-1431.md
Normal file
@@ -0,0 +1,105 @@
|
||||
# Session: 2026-05-29 14:31:30 GMT+8
|
||||
|
||||
- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
|
||||
- **Session ID**: 2736401e-9880-4eb8-a80c-4cd4fe0be792
|
||||
- **Source**: feishu
|
||||
|
||||
## Conversation Summary
|
||||
|
||||
user: 通过account_detail_info这个表里面的login_address帮我判断一下,有多少未退费的付费用户的登录地址不在国内,剔除测试账号
|
||||
assistant: 好的,先看看 `login_address` 的实际数据分布,了解什么算"不在国内"。
|
||||
|
||||
|
||||
assistant: 这些地址看起来都是国内省份。让我查一下有没有海外地址(不以中国省份/直辖市开头的)。
|
||||
|
||||
|
||||
assistant: 现在让我计算核心指标——未退费付费用户中登录地址在海外的人数。
|
||||
|
||||
|
||||
assistant: 让我进一步看看这些海外用户的具体地址分布:
|
||||
|
||||
|
||||
assistant: 数据已查清。以下是结果:
|
||||
|
||||
---
|
||||
|
||||
## 未退费付费用户 · 海外登录地址分析
|
||||
|
||||
**核心结论:18 人登录地址不在中国大陆,占未退费付费用户总数(3,118人)的 0.6%。**
|
||||
|
||||
### 海外地址分布
|
||||
|
||||
| 登录地址 | 人数 |
|
||||
|----------|:---:|
|
||||
| 中国香港 | 4 |
|
||||
| 东京都(日本) | 1 |
|
||||
| 中国台湾-新竹县 | 1 |
|
||||
| 中国台湾-高雄市 | 1 |
|
||||
| 加利福尼亚州-洛杉矶(美国) | 1 |
|
||||
| 大阪府(日本) | 1 |
|
||||
| 庆尚北道(韩国) | 1 |
|
||||
| 攀牙府(泰国) | 1 |
|
||||
| 新南威尔士州(澳大利亚) | 1 |
|
||||
| 柏林(德国) | 1 |
|
||||
| 瓦伦西亚(西班牙) | 1 |
|
||||
| 莱茵兰-普法尔茨(德国) | 1 |
|
||||
| 西部大区-Lautoka(斐济) | 1 |
|
||||
| 下萨克森州-汉诺威(德国) | 1 |
|
||||
| 首尔(韩国) | 1 |
|
||||
|
||||
---
|
||||
|
||||
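**Criteria:**
- Domestic scope: excludes Hong Kong, Macau, and Taiwan; only the 31 provinces, autonomous regions, and municipalities of mainland China
- Unrefunded paying users: `order_status=3` (completed) + non-test accounts (`status=1, deleted_at IS NULL`)
- Hong Kong, China (4 users) and Taiwan, China (2 users) are both counted as "overseas"; the figures change if this criterion is adjusted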
|
||||
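The domestic/overseas criterion above can be sketched as a prefix check. This assumes, based only on the sample addresses in this session, that mainland values of `login_address` begin with a province-level name (for example "广东省-深圳市"); the real format in `account_detail_info` was not shown, and `classify_address` is a hypothetical helper.

```python
# Short names of the 31 mainland provinces, autonomous regions, and municipalities.
MAINLAND_PREFIXES = (
    "北京", "天津", "上海", "重庆",
    "河北", "山西", "辽宁", "吉林", "黑龙江", "江苏", "浙江", "安徽", "福建",
    "江西", "山东", "河南", "湖北", "湖南", "广东", "海南", "四川", "贵州",
    "云南", "陕西", "甘肃", "青海",
    "内蒙古", "广西", "西藏", "宁夏", "新疆",
)


def classify_address(login_address):
    """Classify an address as 'mainland', 'overseas', or 'unknown' (empty)."""
    addr = (login_address or "").strip()
    if not addr:
        return "unknown"
    # Values such as "中国香港" or "东京都" match no mainland prefix.
    return "mainland" if addr.startswith(MAINLAND_PREFIXES) else "overseas"
```

Keeping empty addresses in a separate "unknown" class avoids counting accounts with no recorded login location as overseas.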
|
||||
需要按其他口径(比如含港澳台、或只看严格意义上的海外非华人地区)重新统计吗?
|
||||
user: 我想看看这些用户的行课数据
|
||||
assistant: 数据很多,让我做一个按用户的汇总,包括总学习耗时。
|
||||
|
||||
|
||||
assistant: Now let me compute a clean summary with total learning time.
|
||||
|
||||
|
||||
assistant: 让我再查一下这些用户的付费等级和激活状态。
|
||||
|
||||
|
||||
assistant: 数据齐了,以下是汇总:
|
||||
|
||||
---
|
||||
|
||||
## 海外登录地址用户 · 行课数据总览(18人)
|
||||
|
||||
| 用户ID | 登录地址 | 付费等级 | 激活 | 最后学到 | 总学习(min) | 最近行课 |
|
||||
|:---:|---|---|:---:|---|:---:|---|
|
||||
| 1386 | 攀牙府-达瓜巴(泰国) | L2 | L2 | L2-S2-U19-L04 | 2,434.9 | 2026-04-30 |
|
||||
| 21237 | 庆尚北道(韩国) | L1+L2 | L1+L2 | L1-S1-U06-L05 | 933.6 | 2026-05-28 |
|
||||
| 2570 | 西部大区-Lautoka(斐济) | L2 | L2 | L2-S1-U04-L01 | 466.4 | 2026-03-07 |
|
||||
| 25341 | 中国香港 | L1+L2 | L1+L2 | L1-S1-U03-L03 | 446.4 | 2026-05-21 |
|
||||
| 4471 | 东京都-港区(日本) | L2 | L2 | L2-S1-U02-L02 | 397.1 | 2026-01-01 |
|
||||
| 24462 | 莱茵兰-普法尔茨(德国) | L1 | L1 | L1-S1-U05-L02 | 379.6 | 2026-05-23 |
|
||||
| 22678 | 首尔(韩国) | L2 | L2 | L2-S1-U03-L03 | 376.9 | 2026-05-10 |
|
||||
| 27417 | 洛杉矶(美国) | L2 | L2 | L2-S1-U02-L02 | 348.8 | 2026-05-21 |
|
||||
| 25943 | 柏林(德国) | L1+L2 | L1 | L1-S1-U03-L02 | 218.7 | 2026-05-09 |
|
||||
| 28268 | 大阪府(日本) | L1+L2 | L2 | L2-S0-U00-L03 | 201.4 | 2026-05-24 |
|
||||
| 27543 | 汉诺威(德国) | L1 | L1 | L1-S1-U01-L04 | 196.4 | 2026-05-27 |
|
||||
| 25000 | 中国香港 | L1+L2 | L1+L2 | L1-S1-U01-L04 | 183.2 | 2026-05-10 |
|
||||
| 27204 | 中国台湾-高雄 | L1 | L1 | L1-S1-U02-L01 | 179.0 | 2026-05-27 |
|
||||
| 24257 | 瓦伦西亚(西班牙) | L1+L2 | 未激活 | L1-S1-U01-L01 | 116.7 | 2026-05-01 |
|
||||
| 8735 | 悉尼(澳大利亚) | L1+L2 | L1 | L1-S0-U00-L03 | 102.3 | 2026-04-08 |
|
||||
| 22611 | 中国台湾-新竹 | L1+L2 | 未激活 | L1-S0-U00-L03 | 37.8 | 2026-04-11 |
|
||||
| 13609 | 中国香港 | L2 | L2 | L2-S0-U00-L01 | 24.9 | 2026-02-05 |
|
||||
| 22021 | 中国香港 | L1+L2 | L1+L2 | — | 0.0 | — |
|
||||
|
||||
---
|
||||
|
||||
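**Key findings:**

- **17 of the 18 users have lesson records**; only 1 user (22021, Hong Kong) has none
- **11 users were still active recently (lessons in May)**, about 61%, which suggests most overseas users are not zombie accounts
- **The heaviest learner** is the user in Thailand (1386), who has reached S2-U19 in L2 with 40+ hours in total
- **4 users have made little progress** (L1-S1-U01 or lower); of these, 22611 (Taiwan) and 24257 (Spain) have paid but not activated
- **3 users have been silent for more than 2 months**: 4471 (Japan, last lesson 1/1), 13609 (Hong Kong, last lesson 2/5), and 2570 (Fiji, last lesson 3/7)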
|
||||
|
||||
需要进一步分析哪些维度吗?比如付费金额、订单时间分布等。
|
||||
90
memory/2026-05-29-1557.md
Normal file
@@ -0,0 +1,90 @@
|
||||
# Session: 2026-05-29 15:57:11 GMT+8
|
||||
|
||||
- **Session Key**: agent:main:feishu:direct:ou_e63ce6b760ad39382852472f28fbe2a2
|
||||
- **Session ID**: 54b730fa-d68e-45d5-8ef8-5f740482a430
|
||||
- **Source**: feishu
|
||||
|
||||
## Conversation Summary
|
||||
|
||||
assistant: 已执行 ✅
|
||||
|
||||
- **明细数据:** 4 条记录,9 个字段
|
||||
- **按日期去重:** 1 行(目前只有 1 个日期有数据)
|
||||
- **按source_l3去重:** 4 行(4 个不同的 source_l3)
|
||||
|
||||
表格已更新:https://makee-interactive.feishu.cn/sheets/TWeHsCX6JhhsjntBagHcoJcJned
|
||||
user: 确认一下,你有没有vala_class库的stride_contact_bindings表的权限
|
||||
assistant: 有权限 ✅
|
||||
|
||||
`vala_class.stride_contact_bindings` 表,共 12 个字段:
|
||||
|
||||
| 字段 |
|
||||
|------|
|
||||
| id |
|
||||
| external_user_id |
|
||||
| bot_wxid |
|
||||
| vala_account_id |
|
||||
| lesson_status |
|
||||
| pay_status |
|
||||
| pay_status_synced_at |
|
||||
| created_at |
|
||||
| updated_at |
|
||||
| tel_encrypt |
|
||||
| last_login_time |
|
||||
| usage_duration |
|
||||
|
||||
可以正常读取。需要我做什么?
|
||||
user: 那么在update_stride_phone_lead_sheet.py里,我希望通过vala_class库中的表stride_phone_lead的phone_encrypt关联vala_class库的stride_contact_bindings表的tel_encrypt,并在sheet1中添加一列“是否加微”,匹配上的为是,未匹配上为否。另外在sheet2和sheet3中都增加一列“已加微”数量的统计。
|
||||
assistant: Now I'll modify the script to add the "是否加微" / "已加微" columns using the `tel_encrypt` join between the two tables.
|
||||
|
||||
|
||||
assistant: 改好了 ✅ 已更新表格。
|
||||
|
||||
改动内容:
|
||||
|
||||
| Sheet | 新增列 | 说明 |
|
||||
|-------|--------|------|
|
||||
| **Sheet1 明细** | `是否加微` | `phone_encrypt` ↔ `tel_encrypt` 匹配,命中=是,未命中=否 |
|
||||
| **Sheet2 按日期去重** | `已加微` | 按日期统计去重 account_id 中已加微的数量 |
|
||||
| **Sheet3 按source_l3去重** | `已加微` | 按 source_l3 统计去重 account_id 中已加微的数量 |
|
||||
|
||||
当前 `stride_contact_bindings` 中已有 **1892** 个不同的 tel_encrypt,关联匹配正常。
|
||||
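user: Next, for sheet3: if the deduplicated account_id count for a given day is 0, that date should still be added to the table with a value of 0

assistant: You mean Sheet2 (deduplicated by date), right? Sheet3 is grouped by source_l3 and has no date dimension. I'll fill in the date range on Sheet2 first.

assistant: Updated ✅ Sheet2 now fills in every date from the earliest to the latest, with 0 for dates that have no data.

The current data covers only 2026-05-28, so nothing appears to have changed. Once more data accumulates, the blank dates in between will be filled with 0 automatically.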
|
||||
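The date-filling behavior described above can be sketched as follows. This is an illustration under assumptions, not the code in `update_stride_phone_lead_sheet.py`: `fill_missing_dates` and its input shape (a dict from `date` to count) are hypothetical.

```python
from datetime import date, timedelta


def fill_missing_dates(counts_by_date):
    """Return (date, count) rows for every day from the earliest to the latest date."""
    if not counts_by_date:
        return []
    start, end = min(counts_by_date), max(counts_by_date)
    rows = []
    day = start
    while day <= end:
        # Days absent from the input get an explicit 0.
        rows.append((day, counts_by_date.get(day, 0)))
        day += timedelta(days=1)
    return rows
```

With a single date in the input, the output is that one row, which matches the remark that nothing appears to change until more data accumulates.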
user: 你确认一下表格里面各个sheet的顺序
|
||||
assistant: 当前表格中 Sheet 顺序如下:
|
||||
|
||||
| 位置 | 标题 | sheet_id |
|
||||
|------|------|----------|
|
||||
| 1 | 明细数据 | `2dBSrS` |
|
||||
| 2 | 按source_l3去重 | `1X5g5z` |
|
||||
| 3 | 按日期去重 | `1X5g5y` |
|
||||
|
||||
**按日期去重** 和 **按source_l3去重** 的位置是反的——按日期去重跑到了最后一位。脚本写入不受影响(按 sheet_id 精确写入),只是表格里显示的顺序不对。
|
||||
|
||||
需要我把顺序调成「明细数据 → 按日期去重 → 按source_l3去重」吗?
|
||||
user: 需要表格与脚本一致
|
||||
assistant: 新 sheet 总是插到 index 0,所以我需要全部删除后按逆序重建。
|
||||
|
||||
|
||||
assistant: Sheet 顺序已修正。现在更新脚本中的 sheet_id 并执行一次数据回填:
|
||||
|
||||
|
||||
assistant: 搞定 ✅ 表格与脚本已一致。
|
||||
|
||||
调整过程:飞书 API 的 moveSheet 接口一直报 90204,最终通过删除全部 Sheet 后按逆序重建的方式实现了正确的排列顺序。
|
||||
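Why rebuilding in reverse order yields the right layout can be shown with a plain list simulation. This sketch makes no Feishu API calls; it only models the observed behavior that each new sheet is inserted at index 0, and `rebuild_order` is a hypothetical name.

```python
def rebuild_order(desired_titles):
    """Simulate creating sheets one by one when each new sheet lands at index 0."""
    sheets = []
    # Create the last sheet first so that the first sheet ends up on top.
    for title in reversed(desired_titles):
        sheets.insert(0, title)
    return sheets
```

Creating the sheets in the desired order instead would leave them reversed, which is consistent with the swapped positions seen before the fix.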
|
||||
现在的状态:
|
||||
|
||||
| 位置 | Sheet | sheet_id |
|
||||
|------|-------|----------|
|
||||
| 1 | 明细数据 | `1uDxsI` |
|
||||
| 2 | 按日期去重 | `1uxp3G` |
|
||||
| 3 | 按source_l3去重 | `1urKus` |
|
||||
|
||||
脚本中的 sheet_id 已同步更新,数据已回填验证通过。
|
||||
373
scripts/auto_xingke_query.py
Normal file
@@ -0,0 +1,373 @@
|
||||
#!/usr/bin/env python3
"""
Automatic lesson-progress (行课) query backfill: reads pending rows from a Feishu
spreadsheet, matches phone numbers to IDs, queries lesson-progress records, and
writes the results back.
Schedule: cron patrol every 30 minutes, plus group-chat keyword trigger
Owning agent: 小溪 (xiaoxi)

Workflow:
1. Read Sheet 2DOxEI → find rows marked "待查询ID" → match account_id via masked phone number → backfill columns F/G/H
2. Read Sheet 55b0eb → find rows marked "待查询" → query lesson-progress records (progress/time/payment) → backfill columns D/E
"""
import json
import os
import re
import sys
from collections import defaultdict
from datetime import datetime

import psycopg2
import requests

# ── Config ──
PG_HOST = "bj-postgres-16pob4sg.sql.tencentcdb.com"
PG_PORT = 28591
PG_USER = "ai_member"
PG_DB = "vala_bi"

SPREADSHEET_TOKEN = "RFIJsXT8FhGHhctY4RwczcOfnac"
CRED_DIR = "/root/.openclaw/credentials/xiaoxi"

SHEET_ID_QUERY = "2DOxEI"  # phone number → ID matching
SHEET_COURSE = "55b0eb"  # lesson-progress query

LOG_FILE = "/var/log/xiaoxi_xingke_query.log"


def log(msg):
    """Print a timestamped message and append it to LOG_FILE."""
    ts = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    line = f"[{ts}] {msg}"
    print(line)
    with open(LOG_FILE, "a") as f:
        f.write(line + "\n")


def get_pg_password():
    """Read PG_ONLINE_PASSWORD from secrets.env one directory above this script."""
    secrets_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "secrets.env")
    with open(secrets_path) as f:
        for line in f:
            if line.startswith("PG_ONLINE_PASSWORD="):
                return line.strip().split("=", 1)[1].strip("'\"")
    # Fail loudly instead of silently returning None when the key is missing.
    raise RuntimeError("PG_ONLINE_PASSWORD not found in secrets.env")


def get_fs_token():
    """Fetch a Feishu tenant_access_token using the first app in config.json."""
    with open(os.path.join(CRED_DIR, "config.json")) as f:
        cfg = json.load(f)
    resp = requests.post(
        "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal",
        json={"app_id": cfg["apps"][0]["appId"], "app_secret": cfg["apps"][0]["appSecret"]},
        timeout=15
    )
    return resp.json()["tenant_access_token"]


def read_sheet(token, sheet_id):
    """Return all cell values of one sheet as a list of rows."""
    url = f"https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{SPREADSHEET_TOKEN}/values/{sheet_id}"
    resp = requests.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=30)
    data = resp.json()
    if data.get("code") != 0:
        raise RuntimeError(f"读取Sheet失败: {data}")
    return data["data"]["valueRange"]["values"]


def put_values(token, sheet_id, range_str, values):
    """Write values into the given range of one sheet and return the API response."""
    url = f"https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{SPREADSHEET_TOKEN}/values"
    body = {"valueRange": {"range": f"{sheet_id}!{range_str}", "values": values}}
    resp = requests.put(url, headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }, json=body, timeout=30)
    return resp.json()


def mask_phone(phone):
    """Mask an 11-digit mobile number as first 3 + **** + last 4; return None if invalid."""
    phone = str(phone).strip()
    # Numbers read from the sheet may arrive as floats, e.g. "13800000000.0".
    if "." in phone:
        parts = phone.split(".")
        if parts[1] in ("0", "00"):
            phone = parts[0]
    if re.match(r"^1\d{10}$", phone):
        return f"{phone[:3]}****{phone[-4:]}"
    return None


def batch_in(cur, sql_tpl, params, chunk=500):
    """Run an IN (...) query in chunks; sql_tpl must contain exactly one %s for the placeholders."""
    results = []
    for i in range(0, len(params), chunk):
        batch = params[i:i+chunk]
        ph = ",".join(["%s"] * len(batch))
        cur.execute(sql_tpl % ph, batch)
        results.extend(cur.fetchall())
    return results

# ── Step 1: phone number → ID matching ──
def process_id_matching(token, conn):
    log("Step 1: 手机号→ID 匹配")
    rows = read_sheet(token, SHEET_ID_QUERY)
    pending = []
    # Skip the header row; idx + 1 converts to the 1-based sheet row number.
    for idx, row in enumerate(rows[1:], start=1):
        if len(row) > 4 and str(row[4]).strip() == "待查询ID":
            phone = str(row[1]).strip() if len(row) > 1 and row[1] else ""
            pending.append({"row_idx": idx + 1, "phone": phone})

    log(f" 待查询ID: {len(pending)}")
    if not pending:
        log(" 无待处理, 跳过")
        return {"processed": 0, "matched": 0}

    phones_raw = list(set(r["phone"] for r in pending if r["phone"]))
    valid_phones = [(p, mask_phone(p)) for p in phones_raw if mask_phone(p)]
    masks = list(set(m[1] for m in valid_phones))

    cur = conn.cursor()
    masked_to_aid = {}
    for i in range(0, len(masks), 500):
        batch = masks[i:i+500]
        ph = ",".join(["%s"] * len(batch))
        cur.execute(
            f"SELECT id, tel FROM bi_vala_app_account WHERE tel IN ({ph}) AND status=1 AND deleted_at IS NULL",
            batch
        )
        for aid, tel in cur.fetchall():
            # Several accounts can share one masked number; keep the first returned.
            if tel not in masked_to_aid:
                masked_to_aid[tel] = aid
    cur.close()

    phone_to_aid = {}
    for phone, m in valid_phones:
        if m in masked_to_aid:
            phone_to_aid[phone] = masked_to_aid[m]

    log(f" 匹配成功: {len(phone_to_aid)}, 未匹配: {len(valid_phones)-len(phone_to_aid)}")

    # Assemble the backfill data
    now_str = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    results = []
    for r in pending:
        phone = r["phone"]
        if phone in phone_to_aid:
            results.append({"row": r["row_idx"], "user_id": str(phone_to_aid[phone]), "status": "已回填"})
        else:
            results.append({"row": r["row_idx"], "user_id": "", "status": "未查到"})

    results.sort(key=lambda x: x["row"])

    # Write in batches (grouped by contiguous rows)
    groups = []
    cur_grp = []
    for r in results:
        if not cur_grp or r["row"] == cur_grp[-1]["row"] + 1:
            cur_grp.append(r)
        else:
            groups.append(cur_grp)
            cur_grp = [r]
    if cur_grp:
        groups.append(cur_grp)

    for g in groups:
        sr, er = g[0]["row"], g[-1]["row"]
        f_vals = [[r["user_id"]] for r in g]
        g_vals = [[r["status"]] for r in g]
        h_vals = [[now_str] for r in g]
        put_values(token, SHEET_ID_QUERY, f"F{sr}:F{er}", f_vals)
        put_values(token, SHEET_ID_QUERY, f"G{sr}:G{er}", g_vals)
        put_values(token, SHEET_ID_QUERY, f"H{sr}:H{er}", h_vals)
        # Mark column E ("请求状态", request status) as processed
        e_vals = [["已处理"] for _ in g]
        put_values(token, SHEET_ID_QUERY, f"E{sr}:E{er}", e_vals)

    log(f" ID回填完成: {len(results)} 行, 匹配{len(phone_to_aid)}")
    return {"processed": len(results), "matched": len(phone_to_aid)}


# ── Step 2: lesson-record query ──
def process_course_records(token, conn):
    log("Step 2: 行课记录查询")
    rows = read_sheet(token, SHEET_COURSE)
    pending = []
    for idx, row in enumerate(rows[1:], start=1):
        if len(row) > 2 and str(row[2]).strip() == "待查询":
            uid = str(row[1]).strip() if len(row) > 1 and row[1] else ""
            sales = str(row[0]).strip() if len(row) > 0 and row[0] else ""
            pending.append({"row_idx": idx + 1, "user_id": uid, "sales": sales})

    log(f" 待查询: {len(pending)}")
    if not pending:
        log(" 无待处理, 跳过")
        return {"processed": 0, "with_records": 0}

    # Keep only rows whose user_id parses to a positive integer
    valid = []
    for r in pending:
        try:
            aid = int(float(r["user_id"]))
            if aid > 0:
                r["account_id"] = aid
                valid.append(r)
        except (ValueError, TypeError, OverflowError):
            pass

    log(f" 有效用户ID: {len(valid)}")
    if not valid:
        return {"processed": 0, "with_records": 0}

    uid_set = list(set(r["account_id"] for r in valid))

    # Fetch the characters that belong to each account
    cur = conn.cursor()
    account_chars = defaultdict(list)
    char_to_account = {}
    rc = batch_in(cur,
        "SELECT account_id, id, nickname FROM bi_vala_app_character WHERE account_id IN (%s) AND nickname IS NOT NULL AND nickname != '' AND deleted_at IS NULL",
        uid_set
    )
    for aid, cid, nick in rc:
        account_chars[aid].append(cid)
        char_to_account[cid] = aid
    char_ids = list(char_to_account.keys())

    # Chapter id -> (level, season, unit, lesson) mapping
    cur.execute("SELECT id, course_level, course_season, course_unit, course_lesson FROM bi_level_unit_lesson")
    chapter_map = {}
    for ch_id, cl, cs, cu, cl2 in cur.fetchall():
        chapter_map[ch_id] = (cl or "", cs or "", cu or "", cl2 or "")

    # Completed-lesson records (8 shard tables)
    char_plays = defaultdict(lambda: {"latest_time": None, "latest_chapter": None, "total_ms": 0})
    for tbl_idx in range(8):
        table = f"bi_user_chapter_play_record_{tbl_idx}"
        try:
            cur.execute(
                f"SELECT user_id, chapter_id, created_at FROM {table} WHERE play_status=1 AND deleted_at IS NULL AND user_id = ANY(%s)",
                (char_ids,)
            )
            for uid, ch_id, created_at in cur.fetchall():
                ch_data = chapter_map.get(ch_id)
                if ch_data:
                    rec = char_plays[uid]
                    if rec["latest_time"] is None or created_at > rec["latest_time"]:
                        rec["latest_time"] = created_at
                        rec["latest_chapter"] = (ch_id, ch_data)
        except Exception as e:
            log(f" 警告 {table}: {e}")
            # A failed statement aborts the open transaction; roll back so later queries can run
            conn.rollback()

    # Total study time (8 shard tables)
    for tbl_idx in range(8):
        table = f"bi_user_component_play_record_{tbl_idx}"
        try:
            cur.execute(
                f"SELECT user_id, SUM(COALESCE(interval_time,0)) FROM {table} WHERE user_id = ANY(%s) AND deleted_at IS NULL GROUP BY user_id",
                (char_ids,)
            )
            for uid, total_ms in cur.fetchall():
                if uid in char_plays:
                    char_plays[uid]["total_ms"] += (total_ms or 0)
        except Exception as e:
            log(f" 警告 {table}: {e}")
            conn.rollback()

    # Payment status
    ph = ",".join(["%s"] * len(uid_set))
    cur.execute(
        f"SELECT account_id, COUNT(*) FROM bi_vala_order WHERE account_id IN ({ph}) AND pay_success_date IS NOT NULL AND order_status=3 AND deleted_at IS NULL GROUP BY account_id",
        uid_set
    )
    paid = {r[0]: r[1] for r in cur.fetchall()}

    # Activation status
    activation = {}
    try:
        cur.execute(
            f"SELECT t.account_id, t.season_package_level FROM bi_vala_seasonal_ticket t INNER JOIN bi_vala_app_account a ON t.account_id=a.id AND a.status=1 WHERE t.account_id IN ({ph}) AND t.status=1 AND t.deleted_at IS NULL AND t.season_package_level IN ('A1','A2')",
            uid_set
        )
        for aid, lvl in cur.fetchall():
            if aid not in activation:
                activation[aid] = lvl
    except Exception as e:
        # Fall back to "no activation info" instead of swallowing the error silently
        log(f" 警告 bi_vala_seasonal_ticket: {e}")
        conn.rollback()
        activation = {}

    cur.close()

    # Assemble results
    now_str = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    results = []

    for r in valid:
        aid = r["account_id"]
        chars = account_chars.get(aid, [])

        best_time = None
        best_ch = None
        total_ms = 0

        # Pick the most recent completed lesson across all characters of the account
        for cid in chars:
            play = char_plays.get(cid)
            if play and play["latest_chapter"]:
                if best_time is None or play["latest_time"] > best_time:
                    best_time = play["latest_time"]
                    best_ch = play["latest_chapter"]
                total_ms += play["total_ms"]

        if not best_ch:
            p = "已付费" if aid in paid else "未付费"
            record = f"销售:{r['sales']} | 用户:{aid} | 当前:无行课记录 | 最近行课:无 | 学习0min | {p}"
        else:
            ch_id, (cl, cs, cu, cl2) = best_ch
            full_label = f"{cl}-{cs}-{cu}-{cl2}"
            act = activation.get(aid, "")
            prefix = f"{act}体验课" if act else f"{cl}体验课"
            current = f"{prefix}-{full_label}"
            recent = best_time.strftime("%Y-%m-%d") if best_time else "无"
            # Milliseconds to minutes, one decimal; drop a trailing ".0"
            total_min = round(total_ms / 60000, 1)
            if total_min == int(total_min):
                total_min = int(total_min)
            p = "已付费" if aid in paid else "未付费"
            record = f"销售:{r['sales']} | 用户:{aid} | 当前:{current} | 最近行课:{recent} | 学习{total_min}min | {p}"

        results.append({"row": r["row_idx"], "record": record})

    # Backfill, one range per run of contiguous rows
    results.sort(key=lambda x: x["row"])
    groups = []
    cur_grp = []
    for r in results:
        if not cur_grp or r["row"] == cur_grp[-1]["row"] + 1:
            cur_grp.append(r)
        else:
            groups.append(cur_grp)
            cur_grp = [r]
    if cur_grp:
        groups.append(cur_grp)

    for g in groups:
        sr, er = g[0]["row"], g[-1]["row"]
        d_vals = [[r["record"]] for r in g]
        e_vals = [[now_str] for r in g]
        put_values(token, SHEET_COURSE, f"D{sr}:D{er}", d_vals)
        put_values(token, SHEET_COURSE, f"E{sr}:E{er}", e_vals)

    with_records = sum(1 for r in results if "无行课记录" not in r["record"])
    log(f" 行课回填完成: {len(results)} 行, 有记录: {with_records}")
    return {"processed": len(results), "with_records": with_records}


# ── Main ──
def main():
    log("=" * 50)
    log("行课查询自动回填 启动")

    conn = None
    try:
        token = get_fs_token()
        conn = psycopg2.connect(
            host=PG_HOST, port=PG_PORT, user=PG_USER,
            password=get_pg_password(), dbname=PG_DB, connect_timeout=30
        )

        r1 = process_id_matching(token, conn)
        r2 = process_course_records(token, conn)

        total = r1["processed"] + r2["processed"]
        summary = f"ID回填:{r1['processed']}(匹配{r1['matched']}) | 行课:{r2['processed']}(有记录{r2['with_records']})"
        log(f"完成: {summary}")

        if total == 0:
            log("无待处理任务, 静默退出")
        return 0
    except Exception as e:
        log(f"ERROR: {e}")
        import traceback
        traceback.print_exc()
        return 1
    finally:
        # Close the connection even when a step raises
        if conn is not None:
            conn.close()


if __name__ == "__main__":
    sys.exit(main())
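Both backfill steps in `scripts/auto_xingke_query.py` batch their sheet writes by collapsing contiguous row numbers into a single range. A minimal sketch of that grouping as a standalone helper follows; the names `group_contiguous` and `to_ranges` are hypothetical and do not appear in the script, which inlines the loop instead.

```python
def group_contiguous(rows):
    """Split row numbers into runs of consecutive integers.

    Hypothetical helper, not part of auto_xingke_query.py; it mirrors the
    inline grouping loops that precede each batch of put_values calls.
    """
    groups = []
    current = []
    for row in sorted(rows):
        if not current or row == current[-1] + 1:
            current.append(row)
        else:
            groups.append(current)
            current = [row]
    if current:
        groups.append(current)
    return groups


def to_ranges(column, rows):
    """Turn row numbers into A1-style ranges such as 'F2:F4' for one column."""
    return [f"{column}{g[0]}:{column}{g[-1]}" for g in group_contiguous(rows)]
```

With a helper like this, each run costs one write per column regardless of how many rows it spans, which is the point of the grouping in the script.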
80
scripts/export_tel_encrypt_no_gsv.py
Normal file
@ -0,0 +1,80 @@
#!/usr/bin/env python3
"""Export the tel_encrypt field of GSV=0 users to Excel"""
import os
import sys
sys.path.insert(0, '/root/.openclaw/workspace')

import psycopg2
import pandas as pd
from openpyxl import Workbook
from datetime import datetime

# Load the database password from secrets.env
secrets = {}
with open('/root/.openclaw/workspace/secrets.env') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            k, v = line.split('=', 1)
            secrets[k.strip()] = v.strip().strip('"').strip("'")

conn = psycopg2.connect(
    host='bj-postgres-16pob4sg.sql.tencentcdb.com',
    port=28591,
    user='ai_member',
    password=secrets['PG_ONLINE_PASSWORD'],
    dbname='vala_bi'
)

# Query: among all accounts with status=1 that are not deleted, the users with GSV=0 (no valid order)
# GSV=0 definition: no order with order_status=3 (completed)
# Includes: users who never ordered + users whose orders were all refunded
sql = """
SELECT
    a.tel_encrypt
FROM bi_vala_app_account a
LEFT JOIN (
    SELECT
        account_id,
        COUNT(*) FILTER (WHERE order_status IN (3, 4)) AS total_orders,
        COUNT(*) FILTER (WHERE order_status = 3) AS completed_orders
    FROM bi_vala_order
    GROUP BY account_id
) uos ON uos.account_id = a.id
WHERE a.status = 1
  AND a.deleted_at IS NULL
  AND (uos.completed_orders IS NULL OR uos.completed_orders = 0)
ORDER BY a.id
"""

print("执行查询...")
df = pd.read_sql(sql, conn)
conn.close()

print(f"查询完成,共 {len(df)} 条记录")

# Write to Excel
output_dir = '/root/.openclaw/workspace/output'
os.makedirs(output_dir, exist_ok=True)
output_path = os.path.join(output_dir, f'端内注册未成交用户加密手机号_{datetime.now().strftime("%Y%m%d_%H%M%S")}.xlsx')

wb = Workbook()
ws = wb.active
ws.title = "GSV=0用户"

# Header row
headers = ['tel_encrypt']
ws.append(headers)

# Data rows
for _, row in df.iterrows():
    ws.append([row['tel_encrypt']])

# Column width
ws.column_dimensions['A'].width = 30

# Statistics sheet (placeholder; nothing is written yet)

wb.save(output_path)
print(f"已导出到: {output_path}, 共 {len(df)} 人")
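The export script defines GSV=0 as "no order with `order_status = 3`" and implements it with a `LEFT JOIN` on an aggregated subquery. The same rule can be written as an anti-join with `NOT EXISTS`. The sketch below runs it against an in-memory SQLite database so it can be tried without the production server. The tables `account` and `orders` are toy stand-ins for `bi_vala_app_account` and `bi_vala_order`, and the function name is hypothetical; none of this is the script's own code.

```python
import sqlite3


def accounts_without_completed_orders(conn):
    """Return tel_encrypt for active accounts that have no completed order.

    Assumes toy tables `account` and `orders`; order_status = 3 means
    "completed", as in the export script's comments.
    """
    sql = """
        SELECT a.tel_encrypt
        FROM account a
        WHERE a.status = 1
          AND a.deleted_at IS NULL
          AND NOT EXISTS (
              SELECT 1 FROM orders o
              WHERE o.account_id = a.id AND o.order_status = 3
          )
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql)]


def demo():
    """Build a four-account fixture and run the anti-join against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, tel_encrypt TEXT,
                              status INTEGER, deleted_at TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER,
                             order_status INTEGER);
        INSERT INTO account VALUES (1, 'enc-1', 1, NULL);
        INSERT INTO account VALUES (2, 'enc-2', 1, NULL);
        INSERT INTO account VALUES (3, 'enc-3', 1, NULL);
        INSERT INTO account VALUES (4, 'enc-4', 0, NULL);
        INSERT INTO orders VALUES (10, 1, 3);
        INSERT INTO orders VALUES (11, 2, 4);
    """)
    try:
        return accounts_without_completed_orders(conn)
    finally:
        conn.close()
```

In the fixture, account 1 has a completed order and account 4 is inactive, so only the refunded-only account 2 and the never-ordered account 3 are returned.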
365
scripts/refresh_xhs_user_table.py
Normal file
@ -0,0 +1,365 @@
#!/usr/bin/env python3
"""
Xiaohongshu user lesson-progress full table: daily refresh
Schedule: every day at 6:00 AM
Owning agent: 小溪 (xiaoxi)

Data sources: bi_vala_app_account + bi_vala_app_character + bi_user_chapter_play_record (8 shard tables)
+ bi_user_component_play_record (8 shard tables) + bi_vala_order + bi_refund_order
+ bi_user_course_detail + bi_vala_seasonal_ticket

Filter: read the uid list from Feishu sheet Yzs0sPw2KhZ03gtKmuuctZ5LnJf (Sheet 75934f) and build the full table per uid.
Rows are no longer filtered by order channel or payment status; every uid is included.
Output: Feishu sheet Af1psbiYphO5N0txTkAcnJnInmc (Sheet 630066)
"""

import json, requests, os, psycopg2, sys
from datetime import datetime
from collections import defaultdict
from psycopg2.extras import execute_values

# ── Configuration ──
PG_HOST = "bj-postgres-16pob4sg.sql.tencentcdb.com"
PG_PORT = 28591
PG_USER = "ai_member"
PG_DB = "vala_bi"
CRED_DIR = "/root/.openclaw/credentials/xiaoxi"
SPREADSHEET_TOKEN = "Af1psbiYphO5N0txTkAcnJnInmc"
SHEET_ID = "630066"
LOG_FILE = "/var/log/xiaoxi_xhs_table_refresh.log"

HEADERS = [
    '用户ID(account_id)', '我方脱敏手机号(tel)', '对方脱敏手机号', '销售归属', '进线日期',
    '注册时间', '渠道来源', '角色昵称', '课程等级', '课程类型',
    '首课时间', '首课章节',
    '体验L01时间', '体验L02时间', '体验L03时间', '体验L04时间', '体验L05时间',
    '已完体验节数', '最近行课时间', '最近行课章节',
    '学习总时长(分钟)', '付费状态', '首次付费时间',
    '总GMV', '总GSV', '付费渠道', '最近登录时间'
]


def log(msg):
    """Print a timestamped line and append it to LOG_FILE."""
    ts = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    line = f"[{ts}] {msg}"
    print(line)
    with open(LOG_FILE, "a") as f:
        f.write(line + "\n")


def get_pg_password():
    """Read PG_ONLINE_PASSWORD from ../secrets.env relative to this script."""
    secrets_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "secrets.env")
    with open(secrets_path) as f:
        for line in f:
            if line.startswith("PG_ONLINE_PASSWORD="):
                return line.strip().split("=", 1)[1].strip("'\"")
    # Fail loudly instead of returning None to psycopg2.connect
    raise RuntimeError("PG_ONLINE_PASSWORD not found in secrets.env")


def get_fs_token():
    """Request a Feishu tenant access token with the first app in config.json."""
    with open(os.path.join(CRED_DIR, "config.json")) as f:
        cfg = json.load(f)
    resp = requests.post(
        "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal",
        json={"app_id": cfg["apps"][0]["appId"], "app_secret": cfg["apps"][0]["appSecret"]},
        timeout=15
    )
    return resp.json()["tenant_access_token"]


def fmt(val):
    """Format a cell value: None -> '', datetime -> 'YYYY-MM-DD HH:MM:SS', else str()."""
    if val is None:
        return ''
    if isinstance(val, datetime):
        return val.strftime('%Y-%m-%d %H:%M:%S')
    return str(val)


def main():
    log("=" * 50)
    log("小红书用户行课全量表刷新 启动")

    try:
        conn = psycopg2.connect(
            host=PG_HOST, port=PG_PORT, user=PG_USER,
            password=get_pg_password(), dbname=PG_DB, connect_timeout=60
        )
        cur = conn.cursor()

        # Step 1: Read uid list from Feishu sheet (user-maintained)
        log("Step 1: 读取 uid 清单")
        UID_SHEET_TOKEN = 'Yzs0sPw2KhZ03gtKmuuctZ5LnJf'
        UID_SHEET_ID = '75934f'
        # The token is needed for this request, so fetch it before the first Feishu call
        token = get_fs_token()
        resp = requests.get(
            f'https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{UID_SHEET_TOKEN}/values/{UID_SHEET_ID}',
            headers={'Authorization': f'Bearer {token}'}, timeout=30
        )
        uid_values = resp.json()['data']['valueRange']['values']
        aids = []
        uid_meta = {}  # aid -> {their_phone, sales, lead_date}
        for row in uid_values[1:]:
            if row and row[0]:
                try:
                    aid = int(str(row[0]).strip())
                    aids.append(aid)
                    uid_meta[aid] = {
                        'their_phone': row[1] if len(row) > 1 else '',
                        'sales': row[2] if len(row) > 2 else '',
                        'lead_date': row[3] if len(row) > 3 else '',
                    }
                except (ValueError, TypeError):
                    # Skip cells that are not integer ids
                    pass
        aids = sorted(set(aids))
        log(f" uid 清单: {len(aids)} 个")

        if not aids:
            log("无小红书用户,退出")
            cur.close()
            conn.close()
            return 0

        # Temp table holding the account ids, dropped at commit
        cur.execute("CREATE TEMP TABLE _tmp_xhs_aids (id int PRIMARY KEY) ON COMMIT DROP")
        execute_values(cur, "INSERT INTO _tmp_xhs_aids (id) VALUES %s", [(a,) for a in aids])

        # Step 2: Account info
        log("Step 2: 账户信息")
        cur.execute("""
            SELECT id, tel, created_at, download_channel
            FROM bi_vala_app_account
            WHERE id IN (SELECT id FROM _tmp_xhs_aids) AND status = 1 AND deleted_at IS NULL
        """)
        accounts = {}
        for aid, tel, cat, ch in cur.fetchall():
            accounts[aid] = {'tel': tel or '', 'reg_time': cat, 'channel': ch or ''}

        # Step 3: Characters
        log("Step 3: 角色信息")
        cur.execute("""
            SELECT account_id, id, nickname, latest_login
            FROM bi_vala_app_character
            WHERE account_id IN (SELECT id FROM _tmp_xhs_aids)
              AND nickname IS NOT NULL AND nickname != '' AND deleted_at IS NULL
        """)
        account_chars = defaultdict(list)
        account_nicknames = defaultdict(list)
        latest_login = {}
        all_char_ids = []
        for aid, cid, nick, ll in cur.fetchall():
            account_chars[aid].append(cid)
            account_nicknames[aid].append(nick)
            all_char_ids.append(cid)
            # Track the most recent login across all characters of the account
            if ll:
                if aid not in latest_login or ll > latest_login[aid]:
                    latest_login[aid] = ll

        # Step 4: Chapter plays
        log("Step 4: 课时记录")
        cur.execute("CREATE TEMP TABLE _tmp_xhs_cids (id int PRIMARY KEY) ON COMMIT DROP")
        execute_values(cur, "INSERT INTO _tmp_xhs_cids (id) VALUES %s", [(c,) for c in all_char_ids])

        char_plays = defaultdict(list)
        for tbl_idx in range(8):
            table = f"bi_user_chapter_play_record_{tbl_idx}"
            # A savepoint lets one failing shard be skipped without aborting the surrounding transaction
            cur.execute("SAVEPOINT shard_query")
            try:
                cur.execute(f"""
                    SELECT cpr.user_id, cpr.chapter_id, cpr.created_at
                    FROM {table} cpr
                    JOIN _tmp_xhs_cids t ON cpr.user_id = t.id
                    WHERE cpr.play_status = 1 AND cpr.deleted_at IS NULL
                """)
                for uid, ch_id, cat in cur.fetchall():
                    char_plays[uid].append((ch_id, cat))
            except Exception as e:
                cur.execute("ROLLBACK TO SAVEPOINT shard_query")
                log(f" 警告 {table}: {e}")
        log(f" 有记录角色: {len(char_plays)}")

        # Step 5: U00 chapter map
        log("Step 5: 课程映射")
        cur.execute("""
            SELECT id, course_level, course_season, course_unit, course_lesson
            FROM bi_level_unit_lesson WHERE course_unit LIKE '%U00%'
        """)
        u00_map = {r[0]: (r[1], r[2], r[3], r[4]) for r in cur.fetchall()}

        # Step 6: Study time
        log("Step 6: 学习耗时")
        char_total_ms = defaultdict(int)
        for tbl_idx in range(8):
            table = f"bi_user_component_play_record_{tbl_idx}"
            # A savepoint lets one failing shard be skipped without aborting the surrounding transaction
            cur.execute("SAVEPOINT shard_query")
            try:
                cur.execute(f"""
                    SELECT comp.user_id, SUM(COALESCE(comp.interval_time, 0))
                    FROM {table} comp
                    JOIN _tmp_xhs_cids t ON comp.user_id = t.id
                    WHERE comp.deleted_at IS NULL
                    GROUP BY comp.user_id
                """)
                for uid, tms in cur.fetchall():
                    char_total_ms[uid] += (tms or 0)
            except Exception as e:
                cur.execute("ROLLBACK TO SAVEPOINT shard_query")
                log(f" 警告 {table}: {e}")

        # Step 7: Orders & refunds
        log("Step 7: 订单与退费")
        cur.execute("""
            SELECT o.account_id, o.pay_success_date, o.pay_amount_int, o.order_status,
                   CASE WHEN r.status = 3 AND o.order_status = 4 THEN 1 ELSE 0 END as is_refunded
            FROM bi_vala_order o
            LEFT JOIN bi_refund_order r ON o.trade_no = r.trade_no
            WHERE o.account_id IN (SELECT id FROM _tmp_xhs_aids) AND o.deleted_at IS NULL
        """)
        orders = defaultdict(list)
        # `status` rather than `os`, so the os module is not shadowed inside main()
        for aid, psd, pai, status, is_ref in cur.fetchall():
            orders[aid].append({'pay_date': psd, 'amount': pai, 'status': status, 'refunded': is_ref})

        # Step 8: Course details
        log("Step 8: 课程分配")
        cur.execute("""
            SELECT account_id, course_level,
                   CASE WHEN expire_time IS NULL THEN '体验课' ELSE '正式课' END
            FROM bi_user_course_detail
            WHERE account_id IN (SELECT id FROM _tmp_xhs_aids) AND deleted_at IS NULL
        """)
        course_info = {}
        # Keep the first row returned per account
        for aid, cl, ct in cur.fetchall():
            if aid not in course_info:
                course_info[aid] = (cl, ct)

        cur.close()
        conn.close()

        # Step 9: Assemble
        log("Step 9: 组装数据")
        results = []
        for aid in sorted(aids):
            acc = accounts.get(aid)
            if not acc:
                continue

            chars = account_chars.get(aid, [])
            first_lesson = None
            first_chapter = None
            u00_completed = set()
            recent_lesson = None
            recent_chapter = None

            # First and most recent lesson across all characters of the account
            for cid in chars:
                for ch_id, cat in char_plays.get(cid, []):
                    if ch_id in u00_map:
                        cl, cs, cu, cl2 = u00_map[ch_id]
                        u00_completed.add(cl2)
                        if first_lesson is None or cat < first_lesson:
                            first_lesson = cat
                            first_chapter = (cl, cs, cu, cl2)
                    elif first_lesson is None:
                        first_lesson = cat
                    if recent_lesson is None or cat > recent_lesson:
                        recent_lesson = cat
                        if ch_id in u00_map:
                            recent_chapter = u00_map[ch_id]

            # Earliest completion time of each U00 (trial) lesson
            u00_times = {}
            for cid in chars:
                for ch_id, cat in char_plays.get(cid, []):
                    if ch_id in u00_map:
                        cl2 = u00_map[ch_id][3]
                        if cl2 not in u00_times or cat < u00_times[cl2]:
                            u00_times[cl2] = cat

            # Milliseconds to minutes, one decimal
            total_min = round(sum(char_total_ms.get(cid, 0) for cid in chars) / 60000, 1)

            user_orders = orders.get(aid, [])
            has_paid = any(o['pay_date'] and o['status'] in (3, 4) for o in user_orders)
            refunded_count = sum(1 for o in user_orders if o['refunded'])
            all_refunded = refunded_count > 0 and all(o['refunded'] or not o['pay_date'] or o['status'] not in (3, 4) for o in user_orders)

            pay_status = '已付费'
            if not has_paid:
                pay_status = '未付费'
            elif all_refunded:
                pay_status = '已退款'
            elif refunded_count > 0:
                pay_status = '已付费(部分退费)'

            first_pay = None
            total_gmv = 0
            total_refund = 0
            for o in user_orders:
                if o['pay_date']:
                    if first_pay is None or o['pay_date'] < first_pay:
                        first_pay = o['pay_date']
                    # pay_amount_int is divided by 100 here; treat a NULL amount as 0
                    amt = (o['amount'] or 0) / 100
                    total_gmv += amt
                    if o['refunded']:
                        total_refund += amt

            ci = course_info.get(aid)
            nicknames = ' / '.join(account_nicknames.get(aid, []))
            meta = uid_meta.get(aid, {})

            results.append([
                str(aid),  # A
                acc['tel'] if acc else '',  # B
                meta.get('their_phone', ''),  # C
                meta.get('sales', ''),  # D
                meta.get('lead_date', ''),  # E
                fmt(acc['reg_time'] if acc else None),  # F
                acc['channel'] if acc else '',  # G
                nicknames,  # H
                ci[0] if ci else '',  # I
                ci[1] if ci else '',  # J
                fmt(first_lesson),  # K
                f"{first_chapter[0]}-{first_chapter[1]}-{first_chapter[2]}-{first_chapter[3]}" if first_chapter else '',  # L
                fmt(u00_times.get('L01')),  # M
                fmt(u00_times.get('L02')),  # N
                fmt(u00_times.get('L03')),  # O
                fmt(u00_times.get('L04')),  # P
                fmt(u00_times.get('L05')),  # Q
                str(len(u00_completed)),  # R
                fmt(recent_lesson),  # S
                f"{recent_chapter[0]}-{recent_chapter[1]}-{recent_chapter[2]}-{recent_chapter[3]}" if recent_chapter else '',  # T
                str(total_min),  # U
                pay_status,  # V
                fmt(first_pay),  # W
                str(round(total_gmv, 2)),  # X
                str(round(total_gmv - total_refund, 2)),  # Y
                '',  # Z: payment channel; no query in this script collects it, so the cell stays empty
                fmt(latest_login.get(aid)),  # AA
            ])

        # Step 10: Write to sheet
        log(f"Step 10: 写入表格 ({len(results)} 行)")
        token = get_fs_token()

        values = [HEADERS] + results
        total_rows = len(values)
        range_str = f"A1:AA{total_rows}"

        resp = requests.put(
            f'https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{SPREADSHEET_TOKEN}/values',
            headers={'Authorization': f'Bearer {token}', 'Content-Type': 'application/json'},
            json={'valueRange': {'range': f'{SHEET_ID}!{range_str}', 'values': values}},
            timeout=120
        )
        r = resp.json()
        if r.get('code') == 0:
            log(f"✅ 刷新完成: {len(results)} 行, {len(HEADERS)} 列")
        else:
            log(f"❌ 写入失败: {r}")
            return 1

        # Step 11: Log summary
        # Look up columns by header name so the counts follow the HEADERS layout
        pay_idx = HEADERS.index('付费状态')
        first_lesson_idx = HEADERS.index('首课时间')
        paid_count = sum(1 for row in results if row[pay_idx] not in ('未付费',))
        first_lesson_count = sum(1 for row in results if row[first_lesson_idx])
        log(f"摘要: 总{len(results)} | 有首课{first_lesson_count} | 付费{paid_count}")
        return 0

    except Exception as e:
        log(f"ERROR: {e}")
        import traceback
        traceback.print_exc()
        return 1


if __name__ == "__main__":
    sys.exit(main())
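The Step 11 summary in `scripts/refresh_xhs_user_table.py` counts rows by column position, which silently breaks whenever a column is added to or moved within `HEADERS`. A minimal sketch of counting by header name instead follows. The shortened `DEMO_HEADERS` list and the `summarize` function are hypothetical and only illustrate the lookup; the real script has 27 columns.

```python
# Shortened, hypothetical header list; the real HEADERS has 27 entries.
DEMO_HEADERS = ['用户ID(account_id)', '首课时间', '付费状态']


def summarize(headers, rows):
    """Count totals by looking columns up by name rather than by fixed index."""
    pay_idx = headers.index('付费状态')
    first_idx = headers.index('首课时间')
    return {
        "total": len(rows),
        # A non-empty first-lesson cell means the user has started a lesson
        "with_first_lesson": sum(1 for r in rows if r[first_idx]),
        # Everything except "未付费" counts as paid, refunds included
        "paid": sum(1 for r in rows if r[pay_idx] != '未付费'),
    }


demo_rows = [
    ['1', '2026-05-01 10:00:00', '已付费'],
    ['2', '', '未付费'],
    ['3', '2026-05-02 09:30:00', '已退款'],
]
summary = summarize(DEMO_HEADERS, demo_rows)
```

`list.index` raises `ValueError` if a header is renamed, so a layout change fails loudly rather than producing wrong counts.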
@ -27,9 +27,9 @@ FEISHU_APP_ID = "cli_a929ae22e0b8dcc8"
|
||||
FEISHU_APP_SECRET = "OtFjMy7p3qE3VvLbMdcWidwgHOnGD4FJ"
 SPREADSHEET_TOKEN = "TWeHsCX6JhhsjntBagHcoJcJned"
 
-SHEET_MINGXI = "2dBSrS"  # detail data
-SHEET_DATE_DEDUP = "1X5g5y"  # deduplicated by date
-SHEET_SRC_DEDUP = "1X5g5z"  # deduplicated by source_l3
+SHEET_MINGXI = "1uDxsI"  # detail data
+SHEET_DATE_DEDUP = "1uxp3G"  # deduplicated by date
+SHEET_SRC_DEDUP = "1urKus"  # deduplicated by source_l3
 
 TZ = timezone(timedelta(hours=8))
 
@ -54,24 +54,29 @@ def get_feishu_token():
|
||||
raise
|
||||
|
||||
def query_data():
|
||||
"""SELECT * 动态获取全部字段,表增删字段自动适配"""
|
||||
"""查询 stride_phone_lead + stride_contact_bindings,返回 rows/cols/加微tel集合"""
|
||||
import psycopg2
|
||||
conn = psycopg2.connect(
|
||||
host=PG_HOST, port=PG_PORT, user=PG_USER,
|
||||
password=PG_PASSWORD, dbname=PG_DB
|
||||
)
|
||||
cur = conn.cursor()
|
||||
# 主表
|
||||
cur.execute("SELECT * FROM stride_phone_lead ORDER BY id")
|
||||
rows = cur.fetchall()
|
||||
cols = [desc[0] for desc in cur.description]
|
||||
# 加微绑定表 — 收集已加微的 tel_encrypt
|
||||
cur.execute("SELECT DISTINCT tel_encrypt FROM stride_contact_bindings WHERE tel_encrypt IS NOT NULL")
|
||||
wechat_tels = {r[0] for r in cur.fetchall()}
|
||||
cur.close()
|
||||
conn.close()
|
||||
return rows, cols
|
||||
return rows, cols, wechat_tels
|
||||
|
||||
def build_sheet1_data(rows, cols):
|
||||
"""明细数据 — 列名、列数完全跟随数据库表结构,无需手动维护"""
|
||||
def build_sheet1_data(rows, cols, wechat_tels):
|
||||
"""明细数据 — 列名、列数完全跟随数据库表结构,末尾追加「是否加微」"""
|
||||
ts_cols = {"created_at", "updated_at"}
|
||||
data = [list(cols)]
|
||||
idx_phone_enc = cols.index("phone_encrypt") if "phone_encrypt" in cols else -1
|
||||
data = [list(cols) + ["是否加微"]]
|
||||
for r in rows:
|
||||
row_data = []
|
||||
for i, col in enumerate(cols):
|
||||
@ -82,34 +87,58 @@ def build_sheet1_data(rows, cols):
|
||||
row_data.append(datetime.fromtimestamp(val, tz=TZ).strftime("%Y-%m-%d %H:%M:%S"))
|
||||
else:
|
||||
row_data.append(str(val))
|
||||
# 是否加微
|
||||
phone_enc = r[idx_phone_enc] if idx_phone_enc >= 0 else None
|
||||
row_data.append("是" if phone_enc in wechat_tels else "否")
|
||||
data.append(row_data)
|
||||
return data
|
||||
|
||||
def build_sheet2_data(rows, cols):
|
||||
"""按日期去重 account_id"""
|
||||
def build_sheet2_data(rows, cols, wechat_tels):
|
||||
"""按日期去重 account_id + 已加微数(补全日期范围,无数据填0)"""
|
||||
date_accts = defaultdict(set)
|
||||
date_wechat = defaultdict(set)
|
||||
idx_created = cols.index("created_at")
|
||||
idx_acc = cols.index("account_id")
|
||||
idx_phone_enc = cols.index("phone_encrypt")
|
||||
for r in rows:
|
||||
if r[idx_created]:
|
||||
dt = datetime.fromtimestamp(r[idx_created], tz=TZ).strftime("%Y-%m-%d")
|
||||
date_accts[dt].add(r[idx_acc])
|
||||
data = [["日期", "去重account_id数"]]
|
||||
for date in sorted(date_accts.keys()):
|
||||
data.append([date, len(date_accts[date])])
|
||||
acct = r[idx_acc]
|
||||
date_accts[dt].add(acct)
|
||||
if r[idx_phone_enc] in wechat_tels:
|
||||
date_wechat[dt].add(acct)
|
||||
# 确定日期范围
|
||||
all_dates = sorted(date_accts.keys())
|
||||
if all_dates:
|
||||
start = datetime.strptime(all_dates[0], "%Y-%m-%d").date()
|
||||
end = datetime.strptime(all_dates[-1], "%Y-%m-%d").date()
|
||||
date_range = []
|
||||
d = start
|
||||
while d <= end:
|
||||
date_range.append(d.strftime("%Y-%m-%d"))
|
||||
d += timedelta(days=1)
|
||||
else:
|
||||
date_range = []
|
||||
data = [["日期", "去重account_id数", "已加微"]]
|
||||
for date in date_range:
|
||||
data.append([date, len(date_accts[date]), len(date_wechat[date])])
|
||||
return data
|
||||
|
||||
def build_sheet3_data(rows, cols):
|
||||
"""按source_l3去重 account_id"""
|
||||
def build_sheet3_data(rows, cols, wechat_tels):
|
||||
"""按source_l3去重 account_id + 已加微数"""
|
||||
src_accts = defaultdict(set)
|
||||
src_wechat = defaultdict(set)
|
||||
idx_src = cols.index("source_l3")
|
||||
idx_acc = cols.index("account_id")
|
||||
idx_phone_enc = cols.index("phone_encrypt")
|
||||
for r in rows:
|
||||
src = r[idx_src] or "(空)"
|
||||
src_accts[src].add(r[idx_acc])
|
||||
data = [["source_l3", "去重account_id数"]]
|
||||
if r[idx_phone_enc] in wechat_tels:
|
||||
src_wechat[src].add(r[idx_acc])
|
||||
data = [["source_l3", "去重account_id数", "已加微"]]
|
||||
for src in sorted(src_accts.keys()):
|
||||
data.append([src, len(src_accts[src])])
|
||||
data.append([src, len(src_accts[src]), len(src_wechat[src])])
|
||||
return data
|
||||
|
||||
def clear_and_write(token, sheet_id, data, sheet_name):
|
||||
@ -155,8 +184,9 @@ def clear_and_write(token, sheet_id, data, sheet_name):
|
||||
def main():
|
||||
# 查询数据
|
||||
log("开始查询数据库...")
|
||||
rows, cols = query_data()
|
||||
rows, cols, wechat_tels = query_data()
|
||||
log(f"查询完成,共 {len(rows)} 条记录,{len(cols)} 个字段:{cols}")
|
||||
log(f"加微绑定 tel_encrypt 共 {len(wechat_tels)} 个")
|
||||
|
||||
# 获取飞书token
|
||||
log("获取飞书token...")
|
||||
@ -164,17 +194,17 @@ def main():
|
||||
|
||||
# 写入Sheet1
|
||||
log("更新Sheet1 明细数据...")
|
||||
sheet1 = build_sheet1_data(rows, cols)
|
||||
sheet1 = build_sheet1_data(rows, cols, wechat_tels)
|
||||
clear_and_write(token, SHEET_MINGXI, sheet1, "明细数据")
|
||||
|
||||
# 写入Sheet2
|
||||
log("更新Sheet2 按日期去重...")
|
||||
sheet2 = build_sheet2_data(rows, cols)
|
||||
sheet2 = build_sheet2_data(rows, cols, wechat_tels)
|
||||
clear_and_write(token, SHEET_DATE_DEDUP, sheet2, "按日期去重")
|
||||
|
||||
# 写入Sheet3
|
||||
log("更新Sheet3 按source_l3去重...")
|
||||
sheet3 = build_sheet3_data(rows, cols)
|
||||
sheet3 = build_sheet3_data(rows, cols, wechat_tels)
|
||||
clear_and_write(token, SHEET_SRC_DEDUP, sheet3, "按source_l3去重")
|
||||
|
||||
log("全部更新完成!")
|
||||
|
||||
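The new `build_sheet2_data` pads the per-date table so that every day between the first and last observed date gets a row, with 0 where no lead was recorded. A minimal standalone sketch of that gap-filling step follows. The function `fill_date_range` is hypothetical, and it takes plain counts rather than the sets of `account_id` values the real function keeps.

```python
from datetime import date, timedelta


def fill_date_range(counts):
    """Return [date, count] rows for every day from the first to the last key.

    `counts` maps 'YYYY-MM-DD' strings to integers; missing days get 0.
    Hypothetical helper illustrating the padding done in build_sheet2_data.
    """
    if not counts:
        return []
    days = sorted(counts)
    current = date.fromisoformat(days[0])
    end = date.fromisoformat(days[-1])
    filled = []
    while current <= end:
        key = current.isoformat()
        filled.append([key, counts.get(key, 0)])
        current += timedelta(days=1)
    return filled


rows = fill_date_range({"2026-05-28": 3, "2026-05-31": 1})
```

Using `counts.get(key, 0)` avoids inserting keys for the missing days, whereas indexing a `defaultdict` as the script does creates empty entries as a side effect; both yield the same table.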