The requirement is to reverse an explode operation on a Spark DataFrame, i.e. to collect the string values spread across several rows back into a single array column.
A unit-test snippet that demonstrates this is given below.
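import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.collect_list
import org.apache.spark.sql.types.{StringType, StructType}

// Assumes `spark` is a SparkSession provided by the enclosing test suite.
test("Reverse-explode operation") {
  import spark.implicits._
  // Exploded input: one row per known language
  val arrayData = Seq(
    Row("James", "Blue", "Java"),
    Row("James", "Blue", "Spark"))
  val arraySchema = new StructType()
    .add("name", StringType)
    .add("Color", StringType)
    .add("knownLanguages", StringType)
  val df = spark.createDataFrame(spark.sparkContext.parallelize(arrayData), arraySchema)
  df.printSchema()
  df.show(false)
  // Group by the remaining columns and collect the strings back into an array
  df.groupBy("name", "Color")
    .agg(collect_list("knownLanguages").alias("knownLanguages"))
    .show(false)
}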
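One way to check that the aggregation really is the inverse of explode is a round trip: collapse the rows with collect_list, explode the resulting array again, and compare with the input. The sketch below is only an illustration. The object name ReverseExplodeSketch and the local SparkSession settings are assumptions, not part of the original test. Rows are compared as sets because collect_list does not guarantee the order of the collected elements.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, explode}

// Hypothetical standalone object, used here only for illustration.
object ReverseExplodeSketch {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch; an existing `spark` would normally be reused.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("reverse-explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // Exploded input: one row per known language.
    val exploded = Seq(
      ("James", "Blue", "Java"),
      ("James", "Blue", "Spark")
    ).toDF("name", "Color", "knownLanguages")

    // Reverse explode: one row per (name, Color) with an array column.
    val collapsed = exploded
      .groupBy("name", "Color")
      .agg(collect_list("knownLanguages").alias("knownLanguages"))

    // Explode again to get back one row per language.
    val roundTrip = collapsed
      .withColumn("knownLanguages", explode(col("knownLanguages")))

    // Compare as sets, since the order of collected elements is not guaranteed.
    val before = exploded.as[(String, String, String)].collect().toSet
    val after = roundTrip.as[(String, String, String)].collect().toSet
    assert(before == after)

    spark.stop()
  }
}
```

If duplicate values should be dropped from the array, collect_set can be used in place of collect_list.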
Hope this helps with reverse-explode use cases!