🤖 Daily automatic backup - 2026-03-25 08:00:01

小溪 2026-03-25 08:00:01 +08:00
parent e219736a74
commit 9a3840ed43
5 changed files with 460 additions and 2 deletions


@@ -41,7 +41,17 @@
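- **Connection details are stored securely in TOOLS.md**
- **Core business tables:**
  - Orders table `bi_vala_order` (production PostgreSQL database `vala_bi`); unless stated otherwise, queries run against this production database
    - Field note: `key_from` is the sales channel and can be used to break down orders, GMV, and other metrics by channel
  - User account table `bi_vala_app_account` (production PostgreSQL database `vala_bi`)
    - Field note: `download_channel` is the channel the user downloaded the app from; use it to attribute newly registered users to a source platform
    - Matching rule: `download_channel` holds Chinese text and is matched by keyword containment, e.g. the Xueersi (学而思) channel is `download_channel LIKE '%学而思%'`
  - User course detail table `bi_user_course_detail` (production PostgreSQL database `vala_bi`)
    - Field notes:
      - `account_id`: account ID / user ID
      - `user_id`: character ID
      - `course_level`: course level, mapped as A1 = L1, A2 = L2
      - `deleted_at`: course deletion time; NULL means the course has not been deleted, a value means it has been deleted
      - `expire_time`: course expiry time; non-NULL means a paid (formal) course, NULL means a trial course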
## Business Knowledge Base
- **13 commonly used SQL query templates collected**
@@ -66,13 +76,39 @@
  - GSV: GMV minus the total amount of completed refunds among the qualifying orders (unit: yuan)
  - Refund rate: qualifying orders with a completed refund / total number of orders * 100%, rounded to 1 decimal place
- **Channel mapping rules (matched on the `key_from` field):**
  - In-app purchase: `app-active-h5-0-0` or `app-sales-bj-qhm-0` (matching either value counts as an in-app purchase)
  - Off-app purchase: every `key_from` value other than the two in-app values above
  - Off-app sales-channel purchase: off-app purchases whose `key_from` starts with `sales-adp`
  - Xiaohongshu store: `newmedia-dianpu-xhs-0-0`
  - Creator livestream (达人直播): `newmedia-daren%` (prefix match)
  - 万物: `newmedia-dianpu-wwxx-0-0`
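- **`sale_channel` mapping rules (apply only to orders with `key_from = app-active-h5-0-0`):**

  | `sale_channel` value | Channel name |
  |----------------------|--------------|
  | 11 | Apple (苹果) |
  | 12 | Huawei (华为) |
  | 13 | Xiaomi (小米) |
  | 14 | Honor (荣耀) |
  | 15 | Tencent MyApp (应用宝) |
  | 17 | Meizu (魅族) |
  | 18 | VIVO |
  | 19 | OPPO |
  | 21 | Xueersi (学而思) |
  | 22 | iFlytek (讯飞) |
  | 23 | BBK (步步高) |
  | 24 | Zuoyebang (作业帮) |
  | 25 | Xiaodu (小度) |
  | 26 | Seewo (希沃) |
  | 27 | BOE (京东方) |
  | 41 | Official website (官网) |
  | 71 | Mini program (小程序) |
  | Any other value | Off-site (站外) |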
- **Amount units:** in `bi_vala_order`, `pay_amount` is in yuan and `pay_amount_int` is in cents (分); use `pay_amount_int` for all sales amounts from now on and divide by 100 to report in yuan
- **Learning-data dimensions:** completion count, average time spent, and accuracy (three grades: Perfect/Good/Oops) can be reported by unit, lesson, or component
- **Special date:** `2025-10-01` is the launch date of the core version; some statistics need to separate data before and after it
- **User-counting definitions:**
  - New users (free registrations): break down by channel with `bi_vala_app_account.download_channel`
  - New paying users: break down by channel with `bi_vala_order.sale_channel` (for in-app orders with `key_from = app-active-h5-0-0`) or with `bi_vala_order.key_from`
- **Learning-data calculation logic:**
  - **First completion time of a lesson:**
    1. Join path: user ID (`bi_vala_app_account.id`) → character ID (`bi_vala_app_character.id`) → `bi_user_chapter_play_record_{shard_no}.user_id`, where `{shard_no}` is the shard table number
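
The `key_from` and `sale_channel` rules in this hunk can be sketched in Python as follows. This is only an illustration of the documented rules, not code from the repository: the function names and the English return labels (`in_app`, `off_app`, `off_app_sales`) are hypothetical, and the channel names are kept in their original Chinese form because they are data labels.

```python
from typing import Optional

# The two key_from values the knowledge base lists as in-app purchases.
IN_APP_KEYS = {"app-active-h5-0-0", "app-sales-bj-qhm-0"}

# sale_channel -> channel name, copied from the mapping table.
SALE_CHANNEL_NAMES = {
    11: "苹果", 12: "华为", 13: "小米", 14: "荣耀", 15: "应用宝",
    17: "魅族", 18: "VIVO", 19: "OPPO", 21: "学而思", 22: "讯飞",
    23: "步步高", 24: "作业帮", 25: "小度", 26: "希沃", 27: "京东方",
    41: "官网", 71: "小程序",
}


def classify_purchase(key_from: str) -> str:
    """Return a coarse purchase type for a key_from value (hypothetical labels)."""
    if key_from in IN_APP_KEYS:
        return "in_app"
    # Off-app purchases whose key_from starts with sales-adp are sales-channel purchases.
    if key_from.startswith("sales-adp"):
        return "off_app_sales"
    return "off_app"


def sale_channel_name(key_from: str, sale_channel: int) -> Optional[str]:
    """Map sale_channel to a name; the table only applies to app-active-h5-0-0 orders."""
    if key_from != "app-active-h5-0-0":
        return None
    # Any value missing from the table counts as off-site (站外).
    return SALE_CHANNEL_NAMES.get(sale_channel, "站外")
```

Returning `None` for other `key_from` values keeps the "only applies to `app-active-h5-0-0`" condition explicit, so callers cannot apply the table to off-app orders by accident.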

generate_report.py

@@ -0,0 +1,99 @@
import pandas as pd

# 1. Load the latest order export that already carries the deal tag (成交标记)
order_df = pd.read_csv('2026年3月1日至今订单_含正确成交标记.csv')
print(f"订单总数:{len(order_df)}")

# 2. GMV and refund fields; pay_amount_int is in cents (分), so divide by 100 for yuan
order_df['GMV'] = order_df['pay_amount_int'] / 100
is_refunded = order_df['order_status'] == 4
order_df['is_refund'] = is_refunded.astype(int)
# GSV is 0 for refunded orders and equals GMV otherwise
order_df['GSV'] = order_df['GMV'].where(~is_refunded, 0.0)
order_df['refund_amount'] = order_df['GMV'].where(is_refunded, 0.0)
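
# 3. Map each deal tag (成交标记) to a top-level channel
def map_channel(tag):
    # A missing tag is read as NaN (a float), so send non-strings to '其他'
    if not isinstance(tag, str):
        return '其他'
    if tag in ['销转', '销转-小龙']:
        return '销转'
    elif tag in ['端内直购', '端内销转']:
        return 'App转化'
    elif tag == '达播':
        return '达播'
    elif tag.startswith('班主任-'):
        return '班主任'
    elif tag == '店铺直购':
        return '店铺直购'
    else:
        return '其他'


order_df['渠道大类'] = order_df['成交标记'].apply(map_channel)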

# 4. Aggregate by top-level channel
channel_stats = order_df.groupby('渠道大类').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    已退款金额=('refund_amount', 'sum'),
    GSV=('GSV', 'sum'),
    退款订单数=('is_refund', 'sum'),
    客单价=('GMV', 'mean'),
).reset_index()
channel_stats['退费率'] = (channel_stats['退款订单数'] / channel_stats['订单数'] * 100).round(1).astype(str) + '%'
channel_stats['GMV'] = channel_stats['GMV'].round(2)
channel_stats['GSV'] = channel_stats['GSV'].round(2)
channel_stats['已退款金额'] = channel_stats['已退款金额'].round(2)
channel_stats['客单价'] = channel_stats['客单价'].round(2)

# 5. Forecast values from the original forecast sheet
pred_data = [
    {'渠道大类': '销转', '预测GSV': 100000},
    {'渠道大类': 'App转化', '预测GSV': 20000},
    {'渠道大类': '达播', '预测GSV': 250000},
    {'渠道大类': '班主任', '预测GSV': 10000},
]
pred_df = pd.DataFrame(pred_data)

# 6. Merge actuals with the forecast
report_df = pd.merge(pred_df, channel_stats, on='渠道大类', how='left')
# Append the store direct-purchase (店铺直购) stats, which have no forecast
shop_stats = channel_stats[channel_stats['渠道大类'] == '店铺直购']
report_df = pd.concat([report_df, shop_stats], ignore_index=True)
# Append the grand total
total_orders = channel_stats['订单数'].sum()
total_refunds = channel_stats['退款订单数'].sum()
total = pd.DataFrame({
    '渠道大类': ['总计'],
    '预测GSV': [pred_df['预测GSV'].sum()],
    '订单数': [total_orders],
    'GMV': [channel_stats['GMV'].sum()],
    '已退款金额': [channel_stats['已退款金额'].sum()],
    'GSV': [channel_stats['GSV'].sum()],
    '退款订单数': [total_refunds],
    '客单价': [channel_stats['GMV'].sum() / total_orders],
    '退费率': [str(round(total_refunds / total_orders * 100, 1)) + '%'],
})
report_df = pd.concat([report_df, total], ignore_index=True)
# Completion rate = actual GSV / forecast GSV; rows without a forecast show '-'
report_df['完成率'] = report_df.apply(
    lambda row: str(round(row['GSV'] / row['预测GSV'] * 100, 1)) + '%'
    if pd.notna(row['预测GSV']) else '-',
    axis=1,
)

# 7. Save the report
output_file = '2026年3月收入预测报表_最新版.xlsx'


def refund_rate(flags):
    """Format the share of refunded orders in a group as a percentage string."""
    return str((flags.sum() / flags.count() * 100).round(1)) + '%'


with pd.ExcelWriter(output_file) as writer:
    report_df.to_excel(writer, sheet_name='整体统计', index=False)
    # Per-creator breakdown for the livestream (达播) channel
    dabo_df = order_df[order_df['渠道大类'] == '达播'].groupby('key_from').agg(
        订单数=('id', 'count'),
        GMV=('GMV', 'sum'),
        GSV=('GSV', 'sum'),
        退费率=('is_refund', refund_rate),
    ).reset_index()
    dabo_df.to_excel(writer, sheet_name='达播达人明细', index=False)
    # Breakdown by deal tag
    tag_df = order_df.groupby('成交标记').agg(
        订单数=('id', 'count'),
        GMV=('GMV', 'sum'),
        GSV=('GSV', 'sum'),
        退费率=('is_refund', refund_rate),
    ).reset_index()
    tag_df.to_excel(writer, sheet_name='成交标记明细', index=False)

print(f"\n最新3月收入预测报表已生成{output_file}")
print("\n整体统计结果:")
print(report_df[['渠道大类', '预测GSV', 'GSV', '完成率', '订单数', 'GMV', '退费率']])
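
To make the per-channel roll-up in `generate_report.py` easier to check by hand, here is a minimal sketch on four made-up orders. The rows are invented for illustration, and `order_status = 3` is only an assumed code for a non-refunded order (the script itself only tests for `4`). The refund rate is kept numeric here and would be formatted with `%` only when written out.

```python
import pandas as pd

# Toy orders; amounts are in cents (分), as pay_amount_int is in the real table.
orders = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "渠道大类": ["达播", "达播", "销转", "销转"],
    "pay_amount_int": [199900, 199900, 99900, 99900],
    "order_status": [3, 4, 3, 3],  # 4 = refunded; 3 is an assumed "paid" code
})

orders["GMV"] = orders["pay_amount_int"] / 100
orders["is_refund"] = (orders["order_status"] == 4).astype(int)
# Keep GMV where the order is not refunded, otherwise count 0 towards GSV.
orders["GSV"] = orders["GMV"].where(orders["is_refund"] == 0, 0.0)

stats = orders.groupby("渠道大类").agg(
    订单数=("id", "count"),
    GMV=("GMV", "sum"),
    GSV=("GSV", "sum"),
    退款订单数=("is_refund", "sum"),
).reset_index()
stats["退费率"] = (stats["退款订单数"] / stats["订单数"] * 100).round(1)

print(stats)
```

With these rows, the 达播 channel has two orders and one refund, so half of its GMV survives as GSV.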

process_order.py

@@ -0,0 +1,68 @@
import pandas as pd

# Load table A (the reference sheet supplied by the user)
table_a = pd.read_excel('reference_order.xlsx')
# Rename columns so they line up with table B
table_a = table_a.rename(columns={'订单号': 'out_trade_no', 'keyFrom': 'key_from_a'})
# Keep only the columns needed
table_a = table_a[['out_trade_no', 'key_from_a', '成交标记']]
print(f"表A总订单数{len(table_a)},其中成交标记非空:{len(table_a[table_a['成交标记'].notna()])}")

# Load table B (orders exported from 1 March to date)
table_b = pd.read_csv('2026年3月1日至今订单.csv')
print(f"表B总订单数{len(table_b)}")

# Step 1: match the overlapping orders (present in both tables)
merged = pd.merge(table_b, table_a, on='out_trade_no', how='left', indicator=True)
# Summarise the match
match_stats = merged['_merge'].value_counts()
print(f"\n匹配结果:")
print(f" 两个表都有的订单:{match_stats.get('both', 0)}条 → 直接使用表A的成交标记")
print(f" 仅表B存在的新增订单{match_stats.get('left_only', 0)}条 → 按规则生成新标记")

# Step 2: tag the new orders
# First learn a key_from -> deal tag mapping from the matched orders.
# Rows with an empty tag are dropped first so they cannot shadow a real tag.
learned_map = (
    merged[merged['_merge'] == 'both']
    .dropna(subset=['成交标记'])
    .drop_duplicates('key_from')
    .set_index('key_from')['成交标记']
    .to_dict()
)
print(f"\n从匹配的订单中学习到的key_from→成交标记映射{len(learned_map)}条):")
for k, v in learned_map.items():
    if pd.notna(v):
        print(f" {k} → {v}")

# Rules for generating the tag
def get_final_tag(row):
    # A matched order keeps the tag from table A
    if row['_merge'] == 'both' and pd.notna(row['成交标记']):
        return row['成交标记']
    # A new order prefers the learned mapping
    key_from = row['key_from']
    if key_from in learned_map and pd.notna(learned_map[key_from]):
        return learned_map[key_from]
    # A missing key_from is NaN (a float) and cannot be prefix-matched
    if not isinstance(key_from, str):
        return '其他'
    # Fall back to prefix rules
    if key_from.startswith('newmedia-daren-'):
        return '达播'
    elif key_from == 'app-active-h5-0-0':
        return '端内直购'
    elif key_from.startswith('sales-adp-') or key_from.startswith('app-sales-'):
        return '销转'
    elif key_from.startswith('newmedia-dianpu-'):
        return '店铺直购'
    else:
        return '其他'


# Generate the final deal tag
merged['最终成交标记'] = merged.apply(get_final_tag, axis=1)
# Tags recorded as 0 (number or string) mean store direct purchase (店铺直购)
merged['最终成交标记'] = merged['最终成交标记'].replace({0: '店铺直购', '0': '店铺直购'})
# Drop helper columns
final_df = merged.drop(columns=['key_from_a', '_merge', '成交标记']).rename(columns={'最终成交标记': '成交标记'})
# Save the result
output_file = '2026年3月1日至今订单_含正确成交标记.csv'
final_df.to_csv(output_file, index=False, encoding='utf-8-sig')
print(f"\n处理完成,已生成最终文件:{output_file}")
print(f"最终成交标记分布:")
print(final_df['成交标记'].value_counts())
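
The matching step in `process_order.py` relies on `pd.merge(..., indicator=True)`, which adds a `_merge` column saying where each row was found. A small sketch on invented data shows how matched rows keep their reference tag and how a learned `key_from` mapping fills the rest; the order numbers and the `final_tag` column name are hypothetical.

```python
import pandas as pd

# Table B stand-in: the exported orders.
exported = pd.DataFrame({
    "out_trade_no": ["A1", "A2", "A3"],
    "key_from": ["newmedia-daren-x", "app-active-h5-0-0", "newmedia-daren-x"],
})
# Table A stand-in: the reference sheet, which only knows order A1.
reference = pd.DataFrame({"out_trade_no": ["A1"], "成交标记": ["达播"]})

merged = exported.merge(reference, on="out_trade_no", how="left", indicator=True)
counts = merged["_merge"].value_counts()

# Learn key_from -> tag from matched rows that actually carry a tag.
learned = (
    merged[merged["_merge"] == "both"]
    .dropna(subset=["成交标记"])
    .drop_duplicates("key_from")
    .set_index("key_from")["成交标记"]
    .to_dict()
)

# Keep the reference tag where present, otherwise use the learned mapping.
merged["final_tag"] = merged["成交标记"].fillna(merged["key_from"].map(learned))
print(merged[["out_trade_no", "_merge", "final_tag"]])
```

Order A3 is absent from the reference sheet but shares a `key_from` with A1, so it inherits A1's tag; A2 stays empty and would fall through to the prefix rules in the script.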

regenerate_report.py

@@ -0,0 +1,204 @@
import os

import pandas as pd
import psycopg2

# 1. Compute GSV correctly: a refund counts only when bi_refund_order.status = 3
#    and bi_vala_order.order_status = 4
conn = psycopg2.connect(
    host="bj-postgres-16pob4sg.sql.tencentcdb.com",
    port=28591,
    user="ai_member",
    # Read the password from the environment instead of committing it to the repository
    password=os.environ["PG_PASSWORD"],
    database="vala_bi",
)
# Fetch completed refunds, summed per order
cur = conn.cursor()
cur.execute("""
    SELECT out_trade_no, SUM(refund_amount_int) AS total_refund_int
    FROM bi_refund_order
    WHERE status = 3 AND created_at >= '2026-03-01 00:00:00+08'
    GROUP BY out_trade_no
""")
refund_data = cur.fetchall()
refund_df = pd.DataFrame(refund_data, columns=['out_trade_no', 'total_refund_int'])
cur.close()
conn.close()

# Load the tagged orders
order_df = pd.read_csv('2026年3月1日至今订单_含正确成交标记.csv')
# Join refunds onto orders and derive the amounts
order_df = pd.merge(order_df, refund_df, on='out_trade_no', how='left')
# SUM() can come back as Decimal, so normalise to float before doing arithmetic
order_df['total_refund_int'] = order_df['total_refund_int'].fillna(0).astype(float)
order_df['GMV'] = order_df['pay_amount_int'] / 100
is_refunded = order_df['order_status'] == 4
# A refund amount only counts when the order itself is marked refunded
order_df['refund_amount'] = (order_df['total_refund_int'] / 100).where(is_refunded, 0.0)
order_df['GSV'] = order_df['GMV'] - order_df['refund_amount']
order_df['is_valid_refund'] = is_refunded & (order_df['total_refund_int'] > 0)
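
# 2. Channel mapping (same as the original sheet)
def map_channel(tag):
    # A missing tag is read as NaN (a float), so send non-strings to '其他'
    if not isinstance(tag, str):
        return '其他'
    if tag in ['销转', '销转-小龙']:
        return '销转'
    elif tag in ['端内直购', '端内销转']:
        return 'App转化'
    elif tag == '达播':
        return '达播'
    elif tag.startswith('班主任-'):
        return '班主任'
    else:
        return '其他'


order_df['渠道大类'] = order_df['成交标记'].apply(map_channel)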

# 3. Build the report in the layout of the original sheet
# Header structure of the original sheet
report_data = [
    # Part 1: remaining March forecast and March actuals (summary)
    ['3月剩余预测', 'GMV', '', 'GSV', '', '', '3月实际', 'GMV', '', 'GSV', '', '完成率', ''],
    ['销转', '', '', 100000, '', '', '', '', '', 0, '', '', ''],
    ['App转化', '', '', 20000, '', '', '', '', '', 0, '', '', ''],
    ['达播', '', '', 250000, '', '', '', '', '', 0, '', '', ''],
    ['班主任', '', '', 10000, '', '', '', '', '', 0, '', '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # Sales-conversion (销转) detail
    ['', '', '线索量', '线索成本', '转化率', '客单价', 'GMV', '退款率', 'GSV', '投放成本', '退后ROI', '', ''],
    ['销转', '第一周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第二周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第三周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '第四周', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '小计', 0, 0, 0, 0, 0, 0, 0, 0, 0, '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # App-conversion (App转化) detail
    ['App转化', '', '注册人数', '转化率', '客单价', 'GMV', '退款率', 'GSV', '', '', '', '', ''],
    ['', '自然转化', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '销售转化', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '小计', 0, 0, 0, 0, 0, 0, 0, '', '', '', ''],
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    # Creator livestream (达播) detail
    ['达播', '', '达人', '订单量', '均单价', 'GMV', '退款率', 'GSV', '', '', '', '', ''],
]

# Aggregate the summary figures
channel_summary = order_df.groupby('渠道大类').agg(
    总订单数=('id', 'count'),
    总GMV=('GMV', 'sum'),
    总GSV=('GSV', 'sum'),
    退款订单数=('is_valid_refund', 'sum'),
    总退款金额=('refund_amount', 'sum'),
).reset_index()

# Fill in the summary rows
channel_map = {'销转': 1, 'App转化': 2, '达播': 3, '班主任': 4}
for _, row in channel_summary.iterrows():
    if row['渠道大类'] in channel_map:
        idx = channel_map[row['渠道大类']]
        report_data[idx][3] = 100000 if idx == 1 else 20000 if idx == 2 else 250000 if idx == 3 else 10000
        report_data[idx][8] = round(row['总GSV'], 2)
        report_data[idx][9] = round(row['总GMV'], 2)
        report_data[idx][10] = f"{round(row['总GSV'] / report_data[idx][3] * 100, 1)}%"
        report_data[idx][7] = round(row['总GMV'], 2)
        report_data[idx][11] = f"{round(row['退款订单数'] / row['总订单数'] * 100, 1)}%"

# Fill in the per-creator livestream (达播) detail
dabo_orders = order_df[order_df['渠道大类'] == '达播']
dabo_summary = dabo_orders.groupby('key_from').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退款数=('is_valid_refund', 'sum'),
).reset_index()
dabo_summary['退费率'] = (dabo_summary['退款数'] / dabo_summary['订单数'] * 100).round(1)
dabo_summary['均单价'] = (dabo_summary['GMV'] / dabo_summary['订单数']).round(2)

# Resolve the creator name from key_from; the first name found in the key wins
DAREN_NAMES = ['晚柠', '念妈', '小花生', '盈姐', '百克力', '海淀妈妈优选', '海淀小水妈']


def get_daren_name(key):
    for name in DAREN_NAMES:
        if name in key:
            return name
    return '其他达人'


dabo_summary['达人'] = dabo_summary['key_from'].apply(get_daren_name)
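dabo_final = dabo_summary.groupby('达人').agg(
    订单数=('订单数', 'sum'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退款数=('退款数', 'sum'),
).reset_index()
# Recompute the ratios from summed counts so each creator's refund rate is
# refunded orders / total orders, not an unweighted mean of per-key_from rates
dabo_final['退费率'] = (dabo_final['退款数'] / dabo_final['订单数'] * 100).round(1)
dabo_final['均单价'] = (dabo_final['GMV'] / dabo_final['订单数']).round(2)
for _, row in dabo_final.iterrows():
    report_data.append([
        '', '', row['达人'], row['订单数'], round(row['均单价'], 2), round(row['GMV'], 2),
        f"{row['退费率']}%", round(row['GSV'], 2), '', '', '', '', '',
    ])

# Livestream subtotal
dabo_cnt_total = dabo_final['订单数'].sum()
dabo_gmv_total = dabo_final['GMV'].sum()
dabo_gsv_total = dabo_final['GSV'].sum()
report_data.append([
    '', '', '小计', dabo_cnt_total, round(dabo_gmv_total / dabo_cnt_total, 2), round(dabo_gmv_total, 2),
    f"{round(dabo_orders['is_valid_refund'].sum() / len(dabo_orders) * 100, 1)}%", round(dabo_gsv_total, 2),
    '', '', '', '', '',
])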

# Class-teacher (班主任) detail: one row per category, referenced by name
bzr_rows = {
    name: ['', '', name, 0, 0, 0, 0, 0, '', '', '', '', '']
    for name in ['季转年', '年转年', '转介绍', '退费重报']
}
report_data.extend([
    ['', '', '', '', '', '', '', '', '', '', '', '', ''],
    ['班主任', '', '分类', '订单量', 'GMV', '退款订单', '退款金额', 'GSV', '', '', '', '', ''],
    *bzr_rows.values(),
])
banzhuren_orders = order_df[order_df['渠道大类'] == '班主任']
bzr_summary = banzhuren_orders.groupby('成交标记').agg(
    订单数=('id', 'count'),
    GMV=('GMV', 'sum'),
    GSV=('GSV', 'sum'),
    退款数=('is_valid_refund', 'sum'),
    退款金额=('refund_amount', 'sum'),
).reset_index()
for _, row in bzr_summary.iterrows():
    tag = row['成交标记']
    # Looking rows up by name avoids negative indices that would shift by one
    # (and overwrite the header row) because the subtotal is appended afterwards
    if '年续' in tag or '年转年' in tag:
        target = bzr_rows['年转年']
    elif '转介绍' in tag:
        target = bzr_rows['转介绍']
    elif '重报' in tag:
        target = bzr_rows['退费重报']
    else:
        target = bzr_rows['季转年']
    # Accumulate, since several tags can fall into the same category
    target[3] += row['订单数']
    target[4] = round(target[4] + row['GMV'], 2)
    target[5] += row['退款数']
    target[6] = round(target[6] + row['退款金额'], 2)
    target[7] = round(target[7] + row['GSV'], 2)

# Class-teacher subtotal
report_data.append([
    '', '', '小计', bzr_summary['订单数'].sum(), round(bzr_summary['GMV'].sum(), 2), bzr_summary['退款数'].sum(),
    round(bzr_summary['退款金额'].sum(), 2), round(bzr_summary['GSV'].sum(), 2), '', '', '', '', '',
])

# Convert to a DataFrame and save
df = pd.DataFrame(report_data)
output_file = '2026年3月收入预测报表_与原表格式一致.xlsx'
with pd.ExcelWriter(output_file, engine='openpyxl') as writer:
    df.to_excel(writer, index=False, header=False, sheet_name='3月收入报表')

print("报表已生成格式与原表完全一致GSV已按正确口径重新计算")
print(channel_summary[['渠道大类', '总GMV', '总GSV', '退款订单数']])
print(f"\n总GSV{round(order_df['GSV'].sum(),2)}总GMV{round(order_df['GMV'].sum(),2)} 元,整体退费率:{round(order_df['is_valid_refund'].sum()/len(order_df)*100,1)}%")
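
The GSV rule used in `regenerate_report.py` (subtract a refund only when the refund row has `status = 3` and the order has `order_status = 4`) can be checked on a few invented rows. This is a sketch under those stated rules; the order numbers, amounts, and the `order_status = 3` "paid" code are made up for illustration.

```python
import pandas as pd

orders = pd.DataFrame({
    "out_trade_no": ["A1", "A2", "A3"],
    "pay_amount_int": [100000, 100000, 50000],  # cents (分)
    "order_status": [4, 3, 4],                  # 4 = refunded; 3 is an assumed "paid" code
})
refunds = pd.DataFrame({
    "out_trade_no": ["A1", "A1", "A2"],
    "refund_amount_int": [30000, 20000, 10000],
    "status": [3, 3, 3],                        # 3 = refund completed
})

# Sum completed refunds per order, mirroring the GROUP BY in the SQL query.
done = refunds[refunds["status"] == 3]
per_order = done.groupby("out_trade_no", as_index=False)["refund_amount_int"].sum()

df = orders.merge(per_order, on="out_trade_no", how="left")
df["refund_amount_int"] = df["refund_amount_int"].fillna(0)

# A refund is valid only if the order is also marked refunded.
valid_refund = (df["order_status"] == 4) & (df["refund_amount_int"] > 0)
df["GMV"] = df["pay_amount_int"] / 100
df["refund_amount"] = (df["refund_amount_int"] / 100).where(valid_refund, 0.0)
df["GSV"] = df["GMV"] - df["refund_amount"]

print(df[["out_trade_no", "GMV", "refund_amount", "GSV"]])
```

A1 is partially refunded in two instalments, so only part of its GMV is removed. A2 has a refund row but is not marked refunded, and A3 is marked refunded but has no completed refund row, so neither loses any GSV under this rule.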


@@ -0,0 +1,51 @@
#!/bin/bash
set -e
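# Configuration
# The database password is read from the environment rather than stored in the script
PG_PASSWORD="${PG_PASSWORD:?PG_PASSWORD must be set in the environment}"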
FEISHU_APP_ID="cli_a929ae22e0b8dcc8"
FEISHU_APP_SECRET="${FEISHU_APP_SECRET:?FEISHU_APP_SECRET must be set in the environment}"
RECEIVE_OPEN_ID="ou_e63ce6b760ad39382852472f28fbe2a2"
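# Date range: Monday through Sunday of the previous week.
# Offsets are derived from the ISO weekday (1 = Monday ... 7 = Sunday) so the
# range is always 7 days, whichever day the script runs on.
DOW=$(date +%u)
START_DATE=$(date -d "-$((DOW + 6)) days" +%Y-%m-%d)
END_DATE=$(date -d "-${DOW} days" +%Y-%m-%d)
REPORT_DATE=$(date +%Y%m%d)
CSV_PATH="/tmp/xueersi_weekly_data_${REPORT_DATE}.csv"
EXCEL_PATH="/tmp/学而思渠道周度数据_${START_DATE//-/}-${END_DATE//-/}.xlsx"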
# 1. Query the data and export it to CSV (\copy must stay on a single line)
PGPASSWORD="${PG_PASSWORD}" psql -h bj-postgres-16pob4sg.sql.tencentcdb.com -p 28591 -U ai_member -d vala_bi -c "\copy (WITH date_range AS (SELECT generate_series('${START_DATE}'::date, '${END_DATE}'::date, '1 day'::interval) AS stat_date), daily_new_users AS (SELECT DATE(created_at) AS stat_date, COUNT(DISTINCT id) AS new_user_count FROM bi_vala_app_account WHERE download_channel LIKE '%学而思%' AND created_at >= '${START_DATE} 00:00:00+08' AND created_at < '$(date -d "${END_DATE} +1 day" +%Y-%m-%d) 00:00:00+08' AND deleted_at IS NULL GROUP BY DATE(created_at)), daily_orders AS (SELECT DATE(o.pay_success_date) AS stat_date, COUNT(DISTINCT o.id) AS total_order_count, COUNT(DISTINCT CASE WHEN r.status = 3 AND o.order_status = 4 THEN o.id END) AS refund_order_count, ROUND(SUM(o.pay_amount_int)/100.0, 2) AS gmv, ROUND(SUM(o.pay_amount_int)/100.0 - COALESCE(SUM(CASE WHEN r.status = 3 AND o.order_status = 4 THEN r.refund_amount_int ELSE 0 END)/100.0, 0), 2) AS gsv FROM bi_vala_order o LEFT JOIN bi_refund_order r ON o.out_trade_no = r.out_trade_no WHERE o.key_from = 'app-active-h5-0-0' AND o.sale_channel = 21 AND o.pay_success_date >= '${START_DATE} 00:00:00+08' AND o.pay_success_date < '$(date -d "${END_DATE} +1 day" +%Y-%m-%d) 00:00:00+08' AND o.pay_success_date IS NOT NULL GROUP BY DATE(o.pay_success_date)), daily_data AS (SELECT TO_CHAR(d.stat_date, 'YYYY-MM-DD') AS 日期, COALESCE(u.new_user_count, 0) AS 新增用户数, COALESCE(o.total_order_count - o.refund_order_count, 0) AS 有效订单数, COALESCE(o.gsv, 0) AS GSV_元 FROM date_range d LEFT JOIN daily_new_users u ON d.stat_date = u.stat_date LEFT JOIN daily_orders o ON d.stat_date = o.stat_date) SELECT * FROM daily_data UNION ALL SELECT '合计' AS 日期, SUM(新增用户数) AS 新增用户数, SUM(有效订单数) AS 有效订单数, SUM(GSV_元) AS GSV_元 FROM daily_data ORDER BY 日期) TO '${CSV_PATH}' WITH (FORMAT csv, HEADER true, ENCODING 'UTF8');"
# 2. Convert the CSV to Excel
python3 -c "import pandas as pd; df = pd.read_csv('${CSV_PATH}'); df.to_excel('${EXCEL_PATH}', index=False);"
# 3. Get a Feishu tenant access token
TOKEN_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal" \
  -H "Content-Type: application/json" \
  -d "{\"app_id\":\"${FEISHU_APP_ID}\",\"app_secret\":\"${FEISHU_APP_SECRET}\"}")
TOKEN=$(echo "$TOKEN_RESP" | grep -o '"tenant_access_token":"[^"]*"' | cut -d'"' -f4)
if [ -z "$TOKEN" ]; then echo "ERROR: 获取token失败"; exit 1; fi
# 4. Upload the file
FILE_NAME=$(basename "${EXCEL_PATH}")
UPLOAD_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/im/v1/files" \
  -H "Authorization: Bearer ${TOKEN}" \
  -F "file_type=xls" \
  -F "file_name=${FILE_NAME}" \
  -F "file=@${EXCEL_PATH}")
FILE_KEY=$(echo "$UPLOAD_RESP" | grep -o '"file_key":"[^"]*"' | cut -d'"' -f4)
if [ -z "$FILE_KEY" ]; then echo "ERROR: 上传文件失败"; exit 1; fi
# 5. Send the file as a message
SEND_RESP=$(curl -s -X POST "https://open.feishu.cn/open-apis/im/v1/messages?receive_id_type=open_id" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d "{\"receive_id\":\"${RECEIVE_OPEN_ID}\",\"msg_type\":\"file\",\"content\":\"{\\\"file_key\\\":\\\"${FILE_KEY}\\\"}\"}")
MSG_ID=$(echo "$SEND_RESP" | grep -o '"message_id":"[^"]*"' | cut -d'"' -f4)
if [ -z "$MSG_ID" ]; then echo "ERROR: 发送消息失败"; exit 1; fi
# Clean up temporary files
rm -f "${CSV_PATH}" "${EXCEL_PATH}"
echo "学而思周度报表发送成功,日期范围:${START_DATE} ~ ${END_DATE}"
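
The weekly script needs the Monday-to-Sunday range of the previous week. Relative expressions passed to `date -d` depend on the weekday the script runs on, so it can help to cross-check the range with a small Python sketch. The function name `previous_week` is hypothetical and not part of the repository.

```python
from datetime import date, timedelta
from typing import Tuple


def previous_week(today: date) -> Tuple[date, date]:
    """Return (Monday, Sunday) of the week before the one containing `today`."""
    # date.weekday() is 0 for Monday, so this steps back to the current week's Monday.
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)
    end = this_monday - timedelta(days=1)
    return start, end


start, end = previous_week(date(2026, 3, 25))
print(start.isoformat(), end.isoformat())
```

Because the range is anchored on the current week's Monday, it spans exactly seven days whether the job runs on a Monday, midweek, or a Sunday.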