Need to merge multiple maps by their keys and manipulate the values? This post is probably for you!
Why?
In my previous experience with MapReduce, the collected data was often in a Map structure, and I found myself repeatedly summing, appending, listing, and otherwise manipulating maps.
- Assume we have two maps with people's names as the keys and the count of each name as the values:
```scala
//each map contains count of people names
val names = Map("Sidney" -> 1, "Paul" -> 1, "Jacob" -> 7)
val moreNames = Map("Sidney" -> 1, "Paul" -> 5, "Nick" -> 2)
```
- For some reason we need to merge the two maps into one, summing the values and keeping all keys.
```scala
//need to merge them to 1 map that sums all people names
//expecting : Map("Sidney" -> 2, "Paul" -> 6, "Nick" -> 2, "Jacob" -> 7)
```
- Let's write our own implementation for this task:
```scala
// note: ++ on maps keeps all keys; when a key exists in both maps,
// the value from the second map wins, otherwise the first map's value is kept.
val mergedMap = names ++ moreNames.map {
  case (name, count) => name -> (count + names.getOrElse(name, 0))
}
```
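The same idea can be generalised so the combining function is passed in instead of being hard-coded to `+`. The following is a minimal sketch in plain Scala; `MergeSketch` and `mergeWith` are hypothetical names introduced here for illustration, not part of the standard library or Scalaz.

```scala
object MergeSketch {
  // Hypothetical helper (an assumption of this sketch): merges two maps,
  // applying `combine` to the values of keys that appear in both.
  def mergeWith[K, V](left: Map[K, V], right: Map[K, V])(combine: (V, V) => V): Map[K, V] =
    right.foldLeft(left) { case (acc, (key, value)) =>
      // Keys only in `right` are added as-is; shared keys are combined.
      acc.updated(key, acc.get(key).map(combine(_, value)).getOrElse(value))
    }

  val names     = Map("Sidney" -> 1, "Paul" -> 1, "Jacob" -> 7)
  val moreNames = Map("Sidney" -> 1, "Paul" -> 5, "Nick" -> 2)

  // Sum the counts, as in the hand-written version above.
  val summed = mergeWith(names, moreNames)(_ + _)

  // Swap in a different function to keep the larger count instead.
  val maxed = mergeWith(names, moreNames)((a, b) => a max b)
}
```

Passing the function as a second parameter list lets the compiler infer `V` from the maps first, so `_ + _` needs no type annotations.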
- Next task: let's merge the maps, but instead of summing the values let's list them:
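```scala
// expecting: Map("Sidney" -> List(1, 1), "Paul" -> List(1, 5), "Nick" -> List(2), "Jacob" -> List(7))
// (values from names come first, because names is first in the concatenation)
val mergedMap2 = (names.toSeq ++ moreNames.toSeq)
  .groupBy { case (name, _) => name }
  // map instead of mapValues: mapValues is lazy (and deprecated on Map since Scala 2.13)
  .map { case (name, pairs) => name -> pairs.map { case (_, amount) => amount }.toList }
```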
- OK, that was nice, but the same can be achieved much more easily using Scalaz semigroups:
```scala
// in your project's dependencies
val scalazDep = Seq("org.scalaz" %% "scalaz-core" % "7.0.6")
```
```scala
// now let's use the Scalaz Semigroup concept
// let's merge maps by key and sum the values!
import scalaz.Scalaz._
val mergedMap3 = names |+| moreNames

// let's merge maps by key and list the values
val mergedMap4 = names.map(p => p._1 -> List(p._2)) |+| moreNames.map(p => p._1 -> List(p._2))
```
The main idea when using Scalaz's |+| is to arrange the data structure to match what you need to do. If you need to sum the values, an Int will do the job. If you need to list the values, wrapping each value in a List will do the trick.
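To see why the value type decides the merge behaviour, here is a hand-rolled sketch of the underlying idea in plain Scala. It is not Scalaz's actual implementation: `Combine`, `CombineOps` and the instances below are hypothetical names made up for this illustration, and the `|+|` defined here only borrows the operator name.

```scala
object SemigroupSketch {
  // A simplified stand-in for a semigroup: "two values of A can be combined into one".
  trait Combine[A] { def combine(x: A, y: A): A }

  // Ints combine by addition.
  implicit val intCombine: Combine[Int] = new Combine[Int] {
    def combine(x: Int, y: Int): Int = x + y
  }

  // Lists combine by concatenation.
  implicit def listCombine[A]: Combine[List[A]] = new Combine[List[A]] {
    def combine(x: List[A], y: List[A]): List[A] = x ++ y
  }

  // Maps combine key by key, as long as their values can be combined.
  implicit def mapCombine[K, V](implicit ev: Combine[V]): Combine[Map[K, V]] =
    new Combine[Map[K, V]] {
      def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, v)) =>
          acc.updated(k, acc.get(k).map(ev.combine(_, v)).getOrElse(v))
        }
    }

  // This sketch's own operator, named after the Scalaz one for readability.
  implicit class CombineOps[A](self: A) {
    def |+|(other: A)(implicit c: Combine[A]): A = c.combine(self, other)
  }

  val names     = Map("Sidney" -> 1, "Paul" -> 1, "Jacob" -> 7)
  val moreNames = Map("Sidney" -> 1, "Paul" -> 5, "Nick" -> 2)

  // Int values: shared keys are summed.
  val summed = names |+| moreNames

  // List values: shared keys are concatenated, left map first.
  val listed =
    names.map { case (k, v) => k -> List(v) } |+| moreNames.map { case (k, v) => k -> List(v) }
}
```

The map instance never mentions summing or listing. It delegates to whatever instance exists for the value type, which is why changing the values from `Int` to `List[Int]` changes the result of the merge.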
The |+| operator really comes in handy inside reduce functions, resulting in clean and short code (if you aren't a Scalaz hater):
```scala
//each map contains count of people names
val names = Map("Sidney" -> 1, "Paul" -> 1, "Jacob" -> 7)
val moreNames = Map("Sidney" -> 1, "Paul" -> 5, "Nick" -> 2)

//Money time!
val test = List(names, moreNames).reduce(_ |+| _)
```
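One caveat worth keeping in mind: `reduce` throws an `UnsupportedOperationException` when the list is empty, which can happen when the maps come from a data-collection step. The sketch below shows two standard-library ways around that. To stay self-contained it uses a plain-Scala `sumMaps` function (a hypothetical name, standing in for `|+|` on `Map[String, Int]`) rather than Scalaz.

```scala
object ReduceSketch {
  // Plain-Scala stand-in for |+| on Map[String, Int] (an assumption of this sketch).
  def sumMaps(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    a ++ b.map { case (name, count) => name -> (count + a.getOrElse(name, 0)) }

  val names     = Map("Sidney" -> 1, "Paul" -> 1, "Jacob" -> 7)
  val moreNames = Map("Sidney" -> 1, "Paul" -> 5, "Nick" -> 2)
  val noMaps    = List.empty[Map[String, Int]]

  // noMaps.reduce(sumMaps) would throw, so use one of these instead:

  // reduceOption returns None for an empty list.
  val maybeMerged = noMaps.reduceOption(sumMaps)

  // foldLeft starts from an empty map, so an empty list yields an empty map.
  val foldedEmpty = noMaps.foldLeft(Map.empty[String, Int])(sumMaps)

  // On a non-empty list, foldLeft gives the same result as reduce.
  val folded = List(names, moreNames).foldLeft(Map.empty[String, Int])(sumMaps)
}
```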