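Recently, after salting (hashing) the row keys of an HBase table, I tested extracting data through the Hive mapping table. After I submitted the test SQL, it hung and made no progress.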
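Watching the Executors, I found that they were already up and ready, yet after they sent data back to the Driver everything stalled for a little over 5 minutes. That made me suspect the Driver side. (I also wondered whether uploading the environment JARs was slow, but that turned out not to be the case.)
On the Driver side I found no sign that the Driver was hanging, and uploading the JARs took only about 1 s (most likely because the JARs already existed in the environment).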
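Further observation revealed the cause: each RegionSize calculation took 15 s. The SQL has to scan 17 RegionSize ranges, so the total is 17 * 15 s = 255 s, which matches the submission time.
This is the time the job actually spent being submitted.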
The Hive shell reported an elapsed time of 555.57 s = (actual run time 4.9 min * 60 s + RegionSize calculation time 17 * 15 s) = 549 s, plus roughly 6 s of other overhead.
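The breakdown above can be checked with a few lines of Python. This is only a sketch of the arithmetic: every number is taken from the post, and the constant and function names are made up for illustration.

```python
# Figures quoted in the post; the names are illustrative, not from any tool.
REGION_SIZE_RANGES = 17        # RegionSize ranges the SQL has to scan
SECONDS_PER_CALCULATION = 15   # observed cost of one RegionSize calculation
RUN_SECONDS = 4.9 * 60         # actual run time, 4.9 min
HIVE_SHELL_SECONDS = 555.57    # elapsed time reported by the Hive shell


def timing_breakdown():
    """Split the reported elapsed time into the parts named in the post."""
    region_size = REGION_SIZE_RANGES * SECONDS_PER_CALCULATION
    accounted = RUN_SECONDS + region_size
    other = HIVE_SHELL_SECONDS - accounted
    return {
        "region_size_seconds": region_size,
        "run_seconds": round(RUN_SECONDS, 2),
        "accounted_seconds": round(accounted, 2),
        "other_seconds": round(other, 2),
    }


if __name__ == "__main__":
    for name, seconds in timing_breakdown().items():
        print(f"{name}: {seconds}")
```

The RegionSize step accounts for 255 s of the 549 s, so it is a large share of the total even though no data is being processed during that time.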
Before salting, the whole job took about 12 minutes, so the improvement from salting is still quite noticeable. The test SQL was:
INSERT OVERWRITE TABLE trace.t_test_wx_recommend_log PARTITION (opera_day = 20200406)
SELECT recommend_type,
       opera_time,
       phone_number,
       activity_id,
       rule_id,
       channel_code,
       recommend_position,
       goods1,
       goods2,
       goods3,
       goods4,
       goods5,
       goods6,
       goods7,
       goods8,
       goods9,
       goods10,
       goods11,
       goods12,
       goods13,
       goods14,
       goods15,
       goods16,
       goods17
FROM (
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00020200406'
      AND key < '00020200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00120200406'
      AND key < '00120200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00220200406'
      AND key < '00220200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00320200406'
      AND key < '00320200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00420200406'
      AND key < '00420200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00520200406'
      AND key < '00520200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00620200406'
      AND key < '00620200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00720200406'
      AND key < '00720200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00820200406'
      AND key < '00820200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '00920200406'
      AND key < '00920200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01020200406'
      AND key < '01020200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01120200406'
      AND key < '01120200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01220200406'
      AND key < '01220200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01320200406'
      AND key < '01320200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01420200406'
      AND key < '01420200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01520200406'
      AND key < '01520200407'
    UNION ALL
    SELECT *
    FROM trace.t_test_wx_recommend_log_map
    WHERE key >= '01620200406'
      AND key < '01620200407'
) TEMP;
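The 17 branches in the query differ only in their salt prefix, so the inner subquery could be generated rather than written by hand. The sketch below is one possible way to do that in Python. It assumes, based only on the keys visible in the query, that a row key is a three-digit zero-padded salt (000 to 016) followed by the date as yyyymmdd. The helper names `salted_ranges` and `union_all_subquery` are hypothetical.

```python
from datetime import date, timedelta

SALT_BUCKETS = 17  # prefixes 000..016, as seen in the query above
TABLE = "trace.t_test_wx_recommend_log_map"


def salted_ranges(day, buckets=SALT_BUCKETS):
    """Return one (start_key, stop_key) pair per salt prefix for a single day."""
    start = day.strftime("%Y%m%d")
    stop = (day + timedelta(days=1)).strftime("%Y%m%d")  # exclusive upper bound
    return [(f"{salt:03d}{start}", f"{salt:03d}{stop}") for salt in range(buckets)]


def union_all_subquery(day, table=TABLE):
    """Build the UNION ALL body that scans every salted range for the day."""
    branches = [
        f"SELECT *\nFROM {table}\nWHERE key >= '{lo}'\n  AND key < '{hi}'"
        for lo, hi in salted_ranges(day)
    ]
    return "\nUNION ALL\n".join(branches)


if __name__ == "__main__":
    print(union_all_subquery(date(2020, 4, 6)))
```

Each generated branch is still a separate range scan, so this only saves typing; it does not change how many RegionSize calculations the job performs.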