🤖 Daily automatic backup - 2026-03-25 08:00:01
parent e219736a74
commit 9a3840ed43
40 MEMORY.md
@@ -41,7 +41,17 @@
- **Connection details are stored securely in TOOLS.md**
- **Core business table locations:**
  - Order table `bi_vala_order`: online PostgreSQL database `vala_bi`; unless otherwise specified, query this online database
    - Field note: `key_from` identifies the sales channel and can be used to break down orders, GMV, and other metrics by channel
  - User account table `bi_vala_app_account`: online PostgreSQL database `vala_bi`
    - Field note: `download_channel` identifies the user's download channel, used to attribute new users to their source platform
    - Matching rule: `download_channel` holds Chinese text and is matched by keyword containment, e.g. the Xueersi channel corresponds to `download_channel LIKE '%学而思%'`
  - User course detail table `bi_user_course_detail`: online PostgreSQL database `vala_bi`
    - Field notes:
      - `account_id`: account ID / user ID
      - `user_id`: character (role) ID
      - `course_level`: course level mapping: A1 = L1, A2 = L2
      - `deleted_at`: course deletion time; NULL means the course has not been deleted, a non-NULL value means it has
      - `expire_time`: course expiry time; non-NULL means a paid (formal) course, NULL means a trial course

## Business Knowledge Base

- **13 frequently used SQL query templates collected so far**
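The `deleted_at` / `expire_time` semantics described in the field notes above can be sketched with a small pandas example (the sample rows are made up for illustration; column names follow `bi_user_course_detail`):

```python
import pandas as pd

# Hypothetical sample of bi_user_course_detail rows
courses = pd.DataFrame({
    'account_id': [1, 1, 2, 3],
    'deleted_at': [None, '2026-03-10', None, None],            # NULL => not deleted
    'expire_time': ['2026-12-31', None, None, '2026-06-30'],   # non-NULL => paid course
})

# Active courses: deleted_at is NULL
active = courses[courses['deleted_at'].isna()]
# Paid vs. trial split, per the field notes above
paid = active[active['expire_time'].notna()]
trial = active[active['expire_time'].isna()]
print(len(active), len(paid), len(trial))  # → 3 2 1
```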
@@ -66,13 +76,39 @@
- GSV: GMV minus the total amount of completed refunds among qualifying orders (unit: yuan)
- Refund rate: completed-refund orders among qualifying orders / total order count * 100%, rounded to 1 decimal place
- **Channel mapping rules (matched on the `key_from` field):**
  - In-app: `app-active-h5-0-0`
  - In-app purchase: `app-active-h5-0-0` or `app-sales-bj-qhm-0` (matching either value counts as an in-app purchase)
  - Out-of-app purchase: every `key_from` value other than the two in-app values above
  - Out-of-app sales-channel purchase: within out-of-app purchases, `key_from` values starting with `sales-adp` are sales-channel purchases
  - Xiaohongshu shop: `newmedia-dianpu-xhs-0-0`
  - Influencer livestream: `newmedia-daren%` (prefix match)
  - Wanwu: `newmedia-dianpu-wwxx-0-0`
- **`sale_channel` field mapping (applies only to orders with `key_from = app-active-h5-0-0`):**

| sale_channel value | Channel name |
|--------------------|--------------|
| 11 | Apple |
| 12 | Huawei |
| 13 | Xiaomi |
| 14 | Honor |
| 15 | Yingyongbao |
| 17 | Meizu |
| 18 | VIVO |
| 19 | OPPO |
| 21 | Xueersi |
| 22 | iFLYTEK |
| 23 | BBK |
| 24 | Zuoyebang |
| 25 | Xiaodu |
| 26 | Seewo |
| 27 | BOE |
| 41 | Official website |
| 71 | Mini program |
| any other value | Off-site |

- **Amount unit rule:** in `bi_vala_order`, `pay_amount` is denominated in yuan and `pay_amount_int` in fen; use `pay_amount_int` for all revenue calculations and divide by 100 when reporting in yuan
- **Learning-data dimensions:** completion counts, average time spent, and accuracy (Perfect/Good/Oops levels) can be reported at the unit, lesson, or component level
- **Key date:** `2025-10-01` is the core version launch date; some statistics must distinguish data before and after this point
- **User-count definitions:**
  - New users (free sign-ups): broken down by channel via `bi_vala_app_account.download_channel`
  - New paying users: broken down by channel via `bi_vala_order.sale_channel` (for in-app orders with `key_from = app-active-h5-0-0`) or `bi_vala_order.key_from`
- **Learning-data calculation logic:**
  - **First lesson-completion time:**
    1. Join path: user ID (bi_vala_app_account.id) → character ID (bi_vala_app_character.id) → bi_user_chapter_play_record_{shard_no}.user_id
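As a quick sanity check, the `sale_channel` lookup and the refund-rate formula above can be expressed directly (a minimal sketch: the dict mirrors the mapping table, the sample counts are made up):

```python
# sale_channel -> channel name, mirroring the mapping table above
# (the rule applies only to orders with key_from = 'app-active-h5-0-0');
# any unlisted value falls back to "Off-site"
SALE_CHANNEL_NAMES = {
    11: 'Apple', 12: 'Huawei', 13: 'Xiaomi', 14: 'Honor', 15: 'Yingyongbao',
    17: 'Meizu', 18: 'VIVO', 19: 'OPPO', 21: 'Xueersi', 22: 'iFLYTEK',
    23: 'BBK', 24: 'Zuoyebang', 25: 'Xiaodu', 26: 'Seewo', 27: 'BOE',
    41: 'Official website', 71: 'Mini program',
}

def channel_name(sale_channel: int) -> str:
    return SALE_CHANNEL_NAMES.get(sale_channel, 'Off-site')

def refund_rate(refunded_orders: int, total_orders: int) -> float:
    # completed-refund orders / total orders * 100, rounded to 1 decimal place
    return round(refunded_orders / total_orders * 100, 1) if total_orders else 0.0

print(channel_name(21), channel_name(99), refund_rate(7, 200))  # → Xueersi Off-site 3.5
```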
99 generate_report.py Normal file
@@ -0,0 +1,99 @@
import pandas as pd

# 1. Load the latest order data with verified deal tags
order_df = pd.read_csv('2026年3月1日至今订单_含正确成交标记.csv')
print(f"Total orders: {len(order_df)}")

# 2. Compute GMV and refund-related columns
order_df['GMV'] = order_df['pay_amount_int'] / 100
order_df['is_refund'] = (order_df['order_status'] == 4).astype(int)
# GSV: refunded orders (order_status == 4) contribute 0, all others contribute their GMV
order_df['GSV'] = order_df.apply(lambda row: 0 if row['order_status'] == 4 else row['GMV'], axis=1)
order_df['refund_amount'] = order_df.apply(lambda row: row['GMV'] if row['order_status'] == 4 else 0, axis=1)

# 3. Map deal tags to top-level channel categories
def map_channel(tag):
    if not isinstance(tag, str):  # guard against missing tags
        return '其他'
    if tag in ['销转', '销转-小龙']:
        return '销转'
    elif tag in ['端内直购', '端内销转']:
        return 'App转化'
    elif tag == '达播':
        return '达播'
    elif tag.startswith('班主任-'):
        return '班主任'
    elif tag == '店铺直购':
        return '店铺直购'
    else:
        return '其他'

order_df['渠道大类'] = order_df['成交标记'].apply(map_channel)

# 4. Aggregate by channel category
channel_stats = order_df.groupby('渠道大类').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    已退款金额=('refund_amount', 'sum'),
    GSV=('GSV', 'sum'),
    退款订单数=('is_refund', 'sum'),
    客单价=('GMV', 'mean')
).reset_index()
channel_stats['退费率'] = (channel_stats['退款订单数'] / channel_stats['订单数'] * 100).round(1).astype(str) + '%'
for col in ['GMV', 'GSV', '已退款金额', '客单价']:
    channel_stats[col] = channel_stats[col].round(2)

# 5. Forecast values from the original projection sheet
pred_df = pd.DataFrame([
    {'渠道大类': '销转', '预测GSV': 100000},
    {'渠道大类': 'App转化', '预测GSV': 20000},
    {'渠道大类': '达播', '预测GSV': 250000},
    {'渠道大类': '班主任', '预测GSV': 10000},
])

# 6. Merge actuals with the forecast
report_df = pd.merge(pred_df, channel_stats, on='渠道大类', how='left')
# Append the shop-direct (店铺直购) row, which has no forecast
shop_stats = channel_stats[channel_stats['渠道大类'] == '店铺直购']
report_df = pd.concat([report_df, shop_stats], ignore_index=True)
# Append the grand-total row
total = pd.DataFrame({
    '渠道大类': ['总计'],
    '预测GSV': [pred_df['预测GSV'].sum()],
    '订单数': [channel_stats['订单数'].sum()],
    'GMV': [channel_stats['GMV'].sum()],
    '已退款金额': [channel_stats['已退款金额'].sum()],
    'GSV': [channel_stats['GSV'].sum()],
    '退款订单数': [channel_stats['退款订单数'].sum()],
    '客单价': [channel_stats['GMV'].sum() / channel_stats['订单数'].sum()],
    '退费率': [str((channel_stats['退款订单数'].sum() / channel_stats['订单数'].sum() * 100).round(1)) + '%']
})
report_df = pd.concat([report_df, total], ignore_index=True)
report_df['完成率'] = report_df.apply(
    lambda row: str(round(row['GSV'] / row['预测GSV'] * 100, 1)) + '%' if pd.notna(row['预测GSV']) else '-',
    axis=1
)

# 7. Write the report workbook
output_file = '2026年3月收入预测报表_最新版.xlsx'
with pd.ExcelWriter(output_file) as writer:
    report_df.to_excel(writer, sheet_name='整体统计', index=False)
    # Per-influencer breakdown for the 达播 (influencer livestream) channel
    dabo_df = order_df[order_df['渠道大类'] == '达播'].groupby('key_from').agg(
        订单数=('id', 'count'),
        GMV=('GMV', 'sum'),
        GSV=('GSV', 'sum'),
        退费率=('is_refund', lambda x: str((x.sum() / x.count() * 100).round(1)) + '%')
    ).reset_index()
    dabo_df.to_excel(writer, sheet_name='达播达人明细', index=False)
    # Per-deal-tag breakdown
    tag_df = order_df.groupby('成交标记').agg(
        订单数=('id', 'count'),
        GMV=('GMV', 'sum'),
        GSV=('GSV', 'sum'),
        退费率=('is_refund', lambda x: str((x.sum() / x.count() * 100).round(1)) + '%')
    ).reset_index()
    tag_df.to_excel(writer, sheet_name='成交标记明细', index=False)

print(f"\nLatest March revenue forecast report generated: {output_file}")
print("\nOverall summary:")
print(report_df[['渠道大类', '预测GSV', 'GSV', '完成率', '订单数', 'GMV', '退费率']])
68 process_order.py Normal file
@@ -0,0 +1,68 @@
import pandas as pd

# Load table A (the reference sheet provided by the user)
table_a = pd.read_excel('reference_order.xlsx')
# Rename columns for joining
table_a = table_a.rename(columns={'订单号': 'out_trade_no', 'keyFrom': 'key_from_a'})
# Keep only the fields we need
table_a = table_a[['out_trade_no', 'key_from_a', '成交标记']]
print(f"Table A orders: {len(table_a)}, with non-empty deal tag: {len(table_a[table_a['成交标记'].notna()])}")

# Load table B (orders exported since March 1)
table_b = pd.read_csv('2026年3月1日至今订单.csv')
print(f"Table B orders: {len(table_b)}")

# Step 1: match overlapping orders (present in both tables)
merged = pd.merge(table_b, table_a, on='out_trade_no', how='left', indicator=True)
match_stats = merged['_merge'].value_counts()
print("\nMatch result:")
print(f"  in both tables: {match_stats.get('both', 0)} -> reuse table A's deal tag")
print(f"  only in table B (new orders): {match_stats.get('left_only', 0)} -> tag by rule")

# Step 2: tag the new orders
# First learn a key_from -> deal-tag mapping from the matched orders
learned_map = (
    merged[merged['_merge'] == 'both']
    .drop_duplicates('key_from')[['key_from', '成交标记']]
    .set_index('key_from')['成交标记']
    .to_dict()
)
print(f"\nLearned key_from -> deal-tag mapping from matched orders ({len(learned_map)} entries):")
for k, v in learned_map.items():
    if pd.notna(v):
        print(f"  {k} -> {v}")

# Tag-generation rules
def get_final_tag(row):
    # Matched orders keep table A's tag
    if row['_merge'] == 'both' and pd.notna(row['成交标记']):
        return row['成交标记']
    # New orders first try the learned mapping
    key_from = row['key_from']
    if key_from in learned_map and pd.notna(learned_map[key_from]):
        return learned_map[key_from]
    # Otherwise fall back to prefix rules
    if not isinstance(key_from, str):  # guard against missing key_from
        return '其他'
    if key_from.startswith('newmedia-daren-'):
        return '达播'
    elif key_from == 'app-active-h5-0-0':
        return '端内直购'
    elif key_from.startswith('sales-adp-') or key_from.startswith('app-sales-'):
        return '销转'
    elif key_from.startswith('newmedia-dianpu-'):
        return '店铺直购'
    else:
        return '其他'

# Generate the final deal tag
merged['最终成交标记'] = merged.apply(get_final_tag, axis=1)
# Normalise tags of 0 / '0' to 店铺直购
merged['最终成交标记'] = merged['最终成交标记'].replace([0, '0'], '店铺直购')

# Drop the helper columns and rename back
final_df = merged.drop(columns=['key_from_a', '_merge', '成交标记']).rename(columns={'最终成交标记': '成交标记'})

# Save the result
output_file = '2026年3月1日至今订单_含正确成交标记.csv'
final_df.to_csv(output_file, index=False, encoding='utf-8-sig')
print(f"\nDone, final file written: {output_file}")
print("Final deal-tag distribution:")
print(final_df['成交标记'].value_counts())
204 regenerate_report.py Normal file
@@ -0,0 +1,204 @@
import pandas as pd
import psycopg2

# 1. Correct GSV definition: a refund counts only when bi_refund_order.status = 3
#    AND bi_vala_order.order_status = 4
conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591,
    user="ai_member",
    password="LdfjdjL83h3h3^$&**YGG*",
    database="vala_bi"
)

# Fetch refund totals per order
cur = conn.cursor()
cur.execute("""
    SELECT out_trade_no, SUM(refund_amount_int) AS total_refund_int
    FROM bi_refund_order
    WHERE status = 3 AND created_at >= '2026-03-01 00:00:00+08'
    GROUP BY out_trade_no
""")
refund_df = pd.DataFrame(cur.fetchall(), columns=['out_trade_no', 'total_refund_int'])
cur.close()
conn.close()

# Load the tagged order data
order_df = pd.read_csv('2026年3月1日至今订单_含正确成交标记.csv')

# Join and compute per-order metrics
order_df = pd.merge(order_df, refund_df, on='out_trade_no', how='left')
order_df['total_refund_int'] = order_df['total_refund_int'].fillna(0)
order_df['GMV'] = order_df['pay_amount_int'] / 100
order_df['refund_amount'] = order_df.apply(
    lambda row: row['total_refund_int'] / 100 if row['order_status'] == 4 else 0,
    axis=1
)
order_df['GSV'] = order_df['GMV'] - order_df['refund_amount']
order_df['is_valid_refund'] = (order_df['order_status'] == 4) & (order_df['total_refund_int'] > 0)

# 2. Channel mapping (same rules as the original sheet)
def map_channel(tag):
    if not isinstance(tag, str):  # guard against missing tags
        return '其他'
    if tag in ['销转', '销转-小龙']:
        return '销转'
    elif tag in ['端内直购', '端内销转']:
        return 'App转化'
    elif tag == '达播':
        return '达播'
    elif tag.startswith('班主任-'):
        return '班主任'
    else:
        return '其他'

order_df['渠道大类'] = order_df['成交标记'].apply(map_channel)

# 3. Rebuild the report in the original sheet's layout
report_data = [
    # Part 1: March forecast vs. March actuals summary.
    # Per the header row: column 3 = forecast GSV, 7 = actual GMV,
    # 9 = actual GSV, 11 = completion rate
    ['3月剩余预测', 'GMV', '', 'GSV', '', '', '3月实际', 'GMV', '', 'GSV', '', '完成率', ''],
    ['销转', '', '', 100000, '', '', '', '', '', 0, '', '', ''],
    ['App转化', '', '', 20000, '', '', '', '', '', 0, '', '', ''],
    ['达播', '', '', 250000, '', '', '', '', '', 0, '', '', ''],
    ['班主任', '', '', 10000, '', '', '', '', '', 0, '', '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # 销转 detail
    ['', '', '线索量', '线索成本', '转化率', '客单价', 'GMV', '退款率', 'GSV', '投放成本', '退后ROI', '', ''],
    ['销转', '第一周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第二周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第三周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第四周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '小计', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # App转化 detail
    ['App转化', '', '注册人数', '转化率', '客单价', 'GMV', '退款率', 'GSV', '', '', '', '', ''],
    ['', '自然转化', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '销售转化', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '小计', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # 达播 detail
    ['达播', '', '达人', '订单量', '均单价', 'GMV', '退款率', 'GSV', '', '', '', '', ''],
]

# Aggregate per channel category
channel_summary = order_df.groupby('渠道大类').agg(
    总订单数=('id', 'count'),
    总GMV=('GMV', 'sum'),
    总GSV=('GSV', 'sum'),
    退款订单数=('is_valid_refund', 'sum'),
    总退款金额=('refund_amount', 'sum')
).reset_index()

# Fill the summary rows, aligned to the header columns noted above
# (refund rate goes in the trailing spare column 12)
channel_map = {'销转': 1, 'App转化': 2, '达播': 3, '班主任': 4}
for _, row in channel_summary.iterrows():
    if row['渠道大类'] in channel_map:
        idx = channel_map[row['渠道大类']]
        forecast = report_data[idx][3]  # forecast GSV already set in the template
        report_data[idx][7] = round(row['总GMV'], 2)
        report_data[idx][9] = round(row['总GSV'], 2)
        report_data[idx][11] = f"{round(row['总GSV'] / forecast * 100, 1)}%"
        report_data[idx][12] = f"{round(row['退款订单数'] / row['总订单数'] * 100, 1)}%"

# Fill the per-influencer 达播 detail
dabo_orders = order_df[order_df['渠道大类'] == '达播']
dabo_summary = dabo_orders.groupby('key_from').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退款数=('is_valid_refund', 'sum')
).reset_index()
dabo_summary['退费率'] = (dabo_summary['退款数'] / dabo_summary['订单数'] * 100).round(1)
dabo_summary['均单价'] = (dabo_summary['GMV'] / dabo_summary['订单数']).round(2)

# Resolve influencer names from key_from
def get_daren_name(key):
    for name in ['晚柠', '念妈', '小花生', '盈姐', '百克力', '海淀妈妈优选', '海淀小水妈']:
        if name in key:
            return name
    return '其他达人'

dabo_summary['达人'] = dabo_summary['key_from'].apply(get_daren_name)
dabo_final = dabo_summary.groupby('达人').agg(
    订单数=('订单数', 'sum'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退费率=('退费率', 'mean'),
    均单价=('均单价', 'mean')
).reset_index()

for _, row in dabo_final.iterrows():
    report_data.append([
        '', '', row['达人'], row['订单数'], round(row['均单价'], 2), round(row['GMV'], 2),
        f"{row['退费率']}%", round(row['GSV'], 2), '', '', '', '', ''
    ])

# 达播 subtotal
dabo_total = dabo_final[['订单数', 'GMV', 'GSV']].sum()
report_data.append([
    '', '', '小计', dabo_total['订单数'], round(dabo_total['GMV'] / dabo_total['订单数'], 2), round(dabo_total['GMV'], 2),
    f"{round(dabo_orders['is_valid_refund'].sum() / len(dabo_orders) * 100, 1)}%", round(dabo_total['GSV'], 2), '', '', '', '', ''
])

# 班主任 detail
report_data.extend([
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    ['班主任', '', '分类', '订单量', 'GMV', '退款订单', '退款金额', 'GSV', '', '', '', '', ''],
    ['', '', '季转年', 0, 0, 0, 0, 0, '', '', '', '', ''],
    ['', '', '年转年', 0, 0, 0, 0, 0, '', '', '', '', ''],
    ['', '', '转介绍', 0, 0, 0, 0, 0, '', '', '', '', ''],
    ['', '', '退费重报', 0, 0, 0, 0, 0, '', '', '', '', ''],
])

banzhuren_orders = order_df[order_df['渠道大类'] == '班主任']
bzr_summary = banzhuren_orders.groupby('成交标记').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退款数=('is_valid_refund', 'sum'),
    退款金额=('refund_amount', 'sum')
).reset_index()

# Route each tag to its category row; the four category rows are the last
# four rows appended above: 季转年 = -4, 年转年 = -3, 转介绍 = -2, 退费重报 = -1
for _, row in bzr_summary.iterrows():
    if '年续' in row['成交标记'] or '年转年' in row['成交标记']:
        idx = -3
    elif '转介绍' in row['成交标记']:
        idx = -2
    elif '重报' in row['成交标记']:
        idx = -1
    else:
        idx = -4
    report_data[idx][3] = row['订单数']
    report_data[idx][4] = round(row['GMV'], 2)
    report_data[idx][5] = row['退款数']
    report_data[idx][6] = round(row['退款金额'], 2)
    report_data[idx][7] = round(row['GSV'], 2)

# 班主任 subtotal
bzr_total = bzr_summary[['订单数', 'GMV', 'GSV', '退款数', '退款金额']].sum()
report_data.append([
    '', '', '小计', bzr_total['订单数'], round(bzr_total['GMV'], 2), bzr_total['退款数'],
    round(bzr_total['退款金额'], 2), round(bzr_total['GSV'], 2), '', '', '', '', ''
])

# Convert to a DataFrame and save
df = pd.DataFrame(report_data)
output_file = '2026年3月收入预测报表_与原表格式一致.xlsx'
with pd.ExcelWriter(output_file, engine='openpyxl') as writer:
    df.to_excel(writer, index=False, header=False, sheet_name='3月收入报表')

print("Report generated in the original sheet's layout, with GSV recomputed under the correct definition:")
print(channel_summary[['渠道大类', '总GMV', '总GSV', '退款订单数']])
print(f"\nTotal GSV: {round(order_df['GSV'].sum(), 2)} yuan, total GMV: {round(order_df['GMV'].sum(), 2)} yuan, overall refund rate: {round(order_df['is_valid_refund'].sum() / len(order_df) * 100, 1)}%")
51 scripts/xueersi_weekly_report.sh Executable file
@@ -0,0 +1,51 @@
#!/bin/bash
set -e

# Configuration
PG_PASSWORD="LdfjdjL83h3h3^$&**YGG*"
FEISHU_APP_ID="cli_a929ae22e0b8dcc8"
FEISHU_APP_SECRET="OtFjMy7p3qE3VvLbMdcWidwgHOnGD4FJ"
RECEIVE_OPEN_ID="ou_e63ce6b760ad39382852472f28fbe2a2"

# Date range: last Monday through last Sunday (GNU date semantics)
START_DATE=$(date -d "last monday -7 days" +%Y-%m-%d)
END_DATE=$(date -d "last sunday" +%Y-%m-%d)
REPORT_DATE=$(date +%Y%m%d)
CSV_PATH="/tmp/xueersi_weekly_data_${REPORT_DATE}.csv"
EXCEL_PATH="/tmp/学而思渠道周度数据_${START_DATE//-/}-${END_DATE//-/}.xlsx"

# 1. Run the query and export the result to CSV
PGPASSWORD="${PG_PASSWORD}" psql -h bj-postgres-16pob4sg.sql.tencentcdb.com -p 28591 -U ai_member -d vala_bi -c "\copy (WITH date_range AS (SELECT generate_series('${START_DATE}'::date, '${END_DATE}'::date, '1 day'::interval) AS stat_date), daily_new_users AS (SELECT DATE(created_at) AS stat_date, COUNT(DISTINCT id) AS new_user_count FROM bi_vala_app_account WHERE download_channel LIKE '%学而思%' AND created_at >= '${START_DATE} 00:00:00+08' AND created_at < '$(date -d "${END_DATE} +1 day" +%Y-%m-%d) 00:00:00+08' AND deleted_at IS NULL GROUP BY DATE(created_at)), daily_orders AS (SELECT DATE(o.pay_success_date) AS stat_date, COUNT(DISTINCT o.id) AS total_order_count, COUNT(DISTINCT CASE WHEN r.status = 3 AND o.order_status = 4 THEN o.id END) AS refund_order_count, ROUND(SUM(o.pay_amount_int)/100.0, 2) AS gmv, ROUND(SUM(o.pay_amount_int)/100.0 - COALESCE(SUM(CASE WHEN r.status = 3 AND o.order_status = 4 THEN r.refund_amount_int ELSE 0 END)/100.0, 0), 2) AS gsv FROM bi_vala_order o LEFT JOIN bi_refund_order r ON o.out_trade_no = r.out_trade_no WHERE o.key_from = 'app-active-h5-0-0' AND o.sale_channel = 21 AND o.pay_success_date >= '${START_DATE} 00:00:00+08' AND o.pay_success_date < '$(date -d "${END_DATE} +1 day" +%Y-%m-%d) 00:00:00+08' AND o.pay_success_date IS NOT NULL GROUP BY DATE(o.pay_success_date)), daily_data AS (SELECT TO_CHAR(d.stat_date, 'YYYY-MM-DD') AS 日期, COALESCE(u.new_user_count, 0) AS 新增用户数, COALESCE(o.total_order_count - o.refund_order_count, 0) AS 有效订单数, COALESCE(o.gsv, 0) AS GSV_元 FROM date_range d LEFT JOIN daily_new_users u ON d.stat_date = u.stat_date LEFT JOIN daily_orders o ON d.stat_date = o.stat_date) SELECT * FROM daily_data UNION ALL SELECT '合计' AS 日期, SUM(新增用户数) AS 新增用户数, SUM(有效订单数) AS 有效订单数, SUM(GSV_元) AS GSV_元 FROM daily_data ORDER BY 日期) TO '${CSV_PATH}' WITH (FORMAT csv, HEADER true, ENCODING 'UTF8');"

# 2. Convert CSV to Excel
python3 -c "import pandas as pd; df = pd.read_csv('${CSV_PATH}'); df.to_excel('${EXCEL_PATH}', index=False);"

# 3. Obtain a Feishu tenant access token
TOKEN_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal" \
  -H "Content-Type: application/json" \
  -d "{\"app_id\":\"${FEISHU_APP_ID}\",\"app_secret\":\"${FEISHU_APP_SECRET}\"}")
TOKEN=$(echo "$TOKEN_RESP" | grep -o '"tenant_access_token":"[^"]*"' | cut -d'"' -f4)
if [ -z "$TOKEN" ]; then echo "ERROR: failed to obtain token"; exit 1; fi

# 4. Upload the file
FILE_NAME=$(basename "${EXCEL_PATH}")
UPLOAD_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/im/v1/files" \
  -H "Authorization: Bearer ${TOKEN}" \
  -F "file_type=xls" \
  -F "file_name=${FILE_NAME}" \
  -F "file=@${EXCEL_PATH}")
FILE_KEY=$(echo "$UPLOAD_RESP" | grep -o '"file_key":"[^"]*"' | cut -d'"' -f4)
if [ -z "$FILE_KEY" ]; then echo "ERROR: file upload failed"; exit 1; fi

# 5. Send the file message
SEND_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/im/v1/messages?receive_id_type=open_id" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d "{\"receive_id\":\"${RECEIVE_OPEN_ID}\",\"msg_type\":\"file\",\"content\":\"{\\\"file_key\\\":\\\"${FILE_KEY}\\\"}\"}")
MSG_ID=$(echo "$SEND_RESP" | grep -o '"message_id":"[^"]*"' | cut -d'"' -f4)
if [ -z "$MSG_ID" ]; then echo "ERROR: failed to send message"; exit 1; fi

# Clean up temp files
rm -f "${CSV_PATH}" "${EXCEL_PATH}"

echo "Xueersi weekly report sent successfully, date range: ${START_DATE} to ${END_DATE}"
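The "previous Monday through Sunday" window computed by the script's two `date -d` calls can be sketched in Python for verification (a minimal sketch; it approximates the GNU `date` expressions, which can differ when run exactly on a Monday):

```python
from datetime import date, timedelta

def last_week_range(today: date) -> tuple[date, date]:
    # Monday of the current week, then step back one full week
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)   # last Monday
    end = start + timedelta(days=6)           # last Sunday
    return start, end

start, end = last_week_range(date(2026, 3, 25))  # a Wednesday
print(start, end)  # → 2026-03-16 2026-03-22
```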