CodeFormat
1.脚本跑的时候,要注意不能用tab换行,而是用空格键,否则会导致代码解析错误。
2.在做join时,要注意更换列名,不能使得多个表中有同样的列名,否则会报错
FAILED: SemanticException Column mid Found in more than One Tables/Subqueries #所以将下面代码中的mid都更改
hive -e"set hive.cli.print.header=true;
set mapreduce.job.queuename=root.baseDepSarchQueue;
select mid,lvt,z,a,b,ehc,fo,sap,svar1,kw,ivar5,svar2,ft,sn,sh,sty,ivar2,reason,tv1,fs,r
from temp.jlist_keyword_ard_d as torigin inner join
(
select Tcnt.cntmid as allmid
from
(
select mid as cntmid, count(*) as cnt from temp.jlist_keyword_ard_d
where dt = '2017-07-20'
and song = 'xihuannidengziqi'
group by mid
having cnt < 300
)Tcnt inner join
(
select Tkw.kwmid as searchmid
from
(
select mid as kwmid from temp.jlist_keyword_ard_d
where dt = '2017-07-20'
and song = 'xihuannidengziqi'
and kw = '喜欢你邓紫棋'
group by mid
)Tkw inner join
(
select mid as amid from temp.jlist_keyword_ard_d
where dt = '2017-07-20'
and song = 'xihuannidengziqi'
and a = '1428'
group by mid
)Ta
on Tkw.kwmid = Ta.amid
)Tsearch
on Tcnt.cntmid = Tsearch.searchmid
)tall
on (torigin.mid = tall.allmid and torigin.dt = '2017-07-20' and torigin.song = 'xihuannidengziqi');">xihuannidengziqi-2017-07-20-test.txt
2.在传入字符串到hive脚本时,如果有单引号,需要注意加入双斜杠进行转义,因为双斜杠先会被转义为单斜杠,再与单引号转义为引号
val input="('don\\'t say goodbye','战神榜')"
val sql_original=s"""select 1 as label, query, dt, sum(search_count) as cnt from (
select inputstring as query, dt, count(is_valid) as search_count, 'ard' as plat
from ddl.dt_search_ard_d
where dt >= '$date_start' and dt <= '$date_end'
and inputtype in ('1','2','3','4','6','7','8')
and inputstring in $input
group by inputstring, dt
union all
select keyword as query, dt, count(valid) as search_count, 'pc' as plat
from ddl.dt_search_pc_d
where dt >= '$date_start' and dt <= '$date_end'
and inputtype in ('1','2','3','4','5','6','8')
and keyword in $input
group by keyword, dt
union all
select inputstring as query, dt, count(is_valid) as search_count, 'ios' as plat
from ddl.dt_search_ios_d
where dt >= '$date_start' and dt <= '$date_end'
and inputtype in ('1','2','3','4','6','7','8')
and inputstring in $input
group by inputstring, dt
)triple_count
group by query, dt"""
Last updated
Was this helpful?