Spark Scala Study
val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))
arr.flatMap(x => x._1 + x._2).foreach(println)
A 1 B 2 C 3 (each character on its own line, because flatMap flattens the strings "A1", "B2", "C3" into characters)
flatMap behaves like a map followed by a flatten: each element is mapped to a collection, and the results are concatenated into one flat collection. Compare with a plain map:
val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))
arr.map(x => x._1 + x._2).foreach(println)
A1 B2 C3 (one concatenated string per element)
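A minimal local sketch of the same idea using plain Scala collections (no SparkContext needed), showing that flatMap gives the same result as map followed by flatten:

val pairs = Seq(("A", 1), ("B", 2), ("C", 3))

// map keeps one output element per input element: Seq("A1", "B2", "C3")
val mapped = pairs.map(x => x._1 + x._2)

// flatMap maps each pair to a String (a sequence of Chars) and flattens it:
// Seq('A', '1', 'B', '2', 'C', '3')
val flatMapped = pairs.flatMap(x => x._1 + x._2)

// flatMap is equivalent to map followed by flatten
assert(flatMapped == pairs.map(x => x._1 + x._2).flatten)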
How to count adjacent pairs in records such as A;B;C;D;B;C;D, B;D;A;E;D;C, A;B:
data.map(_.split(";")).flatMap(x => { for (i <- 0 until x.length - 1) yield (x(i) + "," + x(i + 1), 1) }).reduceByKey(_ + _).foreach(println)
The for/yield runs first to build the set of (pair, 1) tuples for each record, then flatMap flattens those sets together before reduceByKey sums the counts.
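A fuller sketch of the same job, assuming an existing SparkContext sc and hardcoding the three sample records above as the input RDD:

val data = sc.parallelize(Seq("A;B;C;D;B;C;D", "B;D;A;E;D;C", "A;B"))

data
  .map(_.split(";"))            // each record becomes an Array of tokens
  .flatMap { x =>
    // one (pair, 1) tuple per adjacent pair of tokens in the record
    for (i <- 0 until x.length - 1) yield (x(i) + "," + x(i + 1), 1)
  }
  .reduceByKey(_ + _)           // sum the counts per adjacent pair
  .foreach(println)
// e.g. (A,B,2), (B,C,2), (C,D,2), (D,B,1), ...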
yield ---
On each iteration of the for loop, yield produces one value and records it, like a buffer. When the loop ends, the whole for/yield expression returns the collection of all yielded values. The resulting collection's type follows the type of the collection being iterated over.
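A small local sketch of for/yield on its own (plain Scala, no Spark), using a hypothetical tokens array:

val tokens = Array("A", "B", "C", "D")

// yield collects one value per iteration; iterating over a Range here
// produces an IndexedSeq (Vector) of tuples
val adjacentPairs = for (i <- 0 until tokens.length - 1)
  yield (tokens(i) + "," + tokens(i + 1), 1)

println(adjacentPairs)  // Vector((A,B,1), (B,C,1), (C,D,1))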