Spark Scala Study

val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))

arr.flatMap(x => x._1 + x._2).foreach(println)

A 1 B 2 C 3 (one character per line: flatMap flattens each concatenated string into its characters)

It is like a double map operation: map each element, then flatten the results.

For map:

val arr = sc.parallelize(Array(("A", 1), ("B", 2), ("C", 3)))

arr.map(x => x._1 + x._2).foreach(println)

A1 B2 C3
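
A rough local sketch of the "double map" idea, using plain Scala collections instead of an RDD (no SparkContext needed), with the same sample pairs:

val pairs = Seq(("A", 1), ("B", 2), ("C", 3))

// map keeps one result per element
val mapped = pairs.map(x => x._1 + x._2)
println(mapped)      // List(A1, B2, C3)

// flatMap maps and then flattens; each string is flattened into its characters
val flatMapped = pairs.flatMap(x => x._1 + x._2)
println(flatMapped)  // List(A, 1, B, 2, C, 3)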

How to count adjacent pairs in data like A;B;C;D;B;C;D, B;D;A;E;D;C, A;B (one record per line)?

data.map(_.split(";")).flatMap(x => { for (i <- 0 until x.length - 1) yield (x(i) + "," + x(i + 1), 1) }).reduceByKey(_ + _).foreach(println)

The for/yield runs first, building the set of pairs for each line; flatMap then flattens those sets together (see the sketch below).
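
A minimal plain-Scala sketch of what that for/yield emits for a single split line, assuming the first sample record above:

val line = "A;B;C;D;B;C;D".split(";")
// one (pair, 1) tuple for every adjacent pair in the line
val pairs = for (i <- 0 until line.length - 1) yield (line(i) + "," + line(i + 1), 1)
println(pairs)  // Vector((A,B,1), (B,C,1), (C,D,1), (D,B,1), (B,C,1), (C,D,1))

reduceByKey(_ + _) then sums the 1s per key, so B,C and C,D each end up with count 2 from this line alone.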

yield ---

On each pass of the for loop, yield produces a value and records it, much like appending to a buffer. When the loop ends, it returns the collection of all yielded values, and that returned collection has the same type as the collection being looped over.
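
A small sketch of yield collecting values in plain Scala (illustrative values only):

// each iteration yields one value; they are collected in order and
// returned as a single collection when the loop finishes
val doubled = for (i <- List(1, 2, 3)) yield i * 2
println(doubled)  // List(2, 4, 6) -- same collection type as the input List

val upper = for (c <- "abc") yield c.toUpper
println(upper)    // ABC -- looping over a String yields a String back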