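After recently salting the HBase row keys, I tested pulling data from the Hive mapping table, and the test SQL hung with no visible progress after I submitted it.
Watching the Executors, I saw that they were ready, but after they sent data back to the Driver everything stalled for more than 5 minutes, so I suspected the Driver side. (I also wondered whether uploading the environment Jars was slow, but that turned out not to be the cause.)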

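On the Driver side, however, there was no sign that the Driver had hung, and uploading the Jars took only about 1 s (probably because the Jars already exist in the environment).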

Looking further, I found the cause: each RegionSize computation took 15 s, and because the SQL has to scan 17 RegionSize ranges, the total is 17 * 15 s = 255 s, which matches the submission delay.

This is the actual submission time of the job.

The Hive shell reported 555.57 s, which breaks down as (actual run time 4.9 min * 60 s = 294 s) + (RegionSize computation 17 * 15 s = 255 s) = 549 s, plus about 6 s of other overhead.
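
The timing breakdown can be checked with a few lines of Python. This is only a sketch that redoes the arithmetic with the figures quoted in this post; the variable names are made up for illustration and do not come from Hive, Spark, or HBase.

```python
# Figures quoted in the post; names are illustrative only.
REGION_RANGES = 17              # salted key ranges scanned by the SQL
SECONDS_PER_REGION_SIZE = 15    # observed cost of one RegionSize computation
RUN_MINUTES = 4.9               # actual run time of the job
HIVE_REPORTED_SECONDS = 555.57  # elapsed time shown in the Hive shell

# Time spent computing RegionSize before the job really starts.
region_size_seconds = REGION_RANGES * SECONDS_PER_REGION_SIZE

# Actual run time converted to seconds.
run_seconds = RUN_MINUTES * 60

# Time explained by the two parts above, and whatever is left over.
accounted_seconds = region_size_seconds + run_seconds
other_seconds = HIVE_REPORTED_SECONDS - accounted_seconds

print(f"RegionSize: {region_size_seconds} s")
print(f"Run time:   {run_seconds:.0f} s")
print(f"Accounted:  {accounted_seconds:.0f} s")
print(f"Other:      {other_seconds:.2f} s")
```

The leftover is a little over 6 s, which matches the "other overhead" figure above.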

Before salting, the whole job took about 12 minutes, so the improvement after salting is quite noticeable.

INSERT OVERWRITE TABLE trace.t_test_wx_recommend_log PARTITION (opera_day = 20200406)
SELECT recommend_type,
       opera_time,
       phone_number,
       activity_id,
       rule_id,
       channel_code,
       recommend_position,
       goods1,
       goods2,
       goods3,
       goods4,
       goods5,
       goods6,
       goods7,
       goods8,
       goods9,
       goods10,
       goods11,
       goods12,
       goods13,
       goods14,
       goods15,
       goods16,
       goods17
FROM (
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00020200406'
           AND key < '00020200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00120200406'
           AND key < '00120200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00220200406'
           AND key < '00220200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00320200406'
           AND key < '00320200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00420200406'
           AND key < '00420200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00520200406'
           AND key < '00520200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00620200406'
           AND key < '00620200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00720200406'
           AND key < '00720200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00820200406'
           AND key < '00820200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '00920200406'
           AND key < '00920200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01020200406'
           AND key < '01020200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01120200406'
           AND key < '01120200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01220200406'
           AND key < '01220200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01320200406'
           AND key < '01320200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01420200406'
           AND key < '01420200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01520200406'
           AND key < '01520200407'
         UNION ALL
         SELECT *
         FROM trace.t_test_wx_recommend_log_map
         WHERE key >= '01620200406'
           AND key < '01620200407') TEMP;
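
The 17 branches of the subquery differ only in the salt prefix, so they could be generated instead of written by hand. The sketch below assumes, judging only from the literals in the SQL above, that the row key starts with a 3-digit zero-padded salt (000 to 016) followed by the date as yyyyMMdd. The function names `salted_ranges` and `build_subquery` are hypothetical helpers, not part of Hive or HBase.

```python
from datetime import date, timedelta


def salted_ranges(day, buckets=17, width=3):
    """Return one (start_key, stop_key) pair per salt prefix for a single day.

    Assumes keys look like <zero-padded salt><yyyyMMdd>..., as in the SQL above.
    """
    start = day.strftime("%Y%m%d")
    stop = (day + timedelta(days=1)).strftime("%Y%m%d")
    return [
        (f"{salt:0{width}d}{start}", f"{salt:0{width}d}{stop}")
        for salt in range(buckets)
    ]


def build_subquery(table, day, buckets=17):
    """Join one range scan per salt prefix with UNION ALL."""
    branches = [
        f"SELECT * FROM {table} WHERE key >= '{lo}' AND key < '{hi}'"
        for lo, hi in salted_ranges(day, buckets)
    ]
    return "\nUNION ALL\n".join(branches)


ranges = salted_ranges(date(2020, 4, 6))
subquery = build_subquery("trace.t_test_wx_recommend_log_map", date(2020, 4, 6))
print(subquery)
```

Each generated branch is still a separate range scan, so the per-range RegionSize cost described earlier would presumably apply in the same way; the sketch only saves typing.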

