Item: Consider aggregating elements to a map

Marcin Moskala
Kt. Academy
Published in
4 min readApr 20, 2020

--

It is not uncommon to have a bigger set of elements we need to access many times. It might be:

  • A cache — data that we download from some service and then keep in local memory to access them more quickly
  • A repository with data loaded from some file
  • An in-memory repository used for different kinds of tests

Those data might represent some list of users, ids, configurations, etc. They are generally given to us as a list, and it is tempting to represent them in our memory the same way:

class NetworkUserRepo(val userService: UserService): UserRepo {
private var users: List<User>? = null
override fun getUser(id: UserId): User? {
if(users == null) {
users = userService.getUsers()
}
return users?.firstOrNull { it.id == id }
}
}

class ConfigurationsRepository(
val configurations: List<Configuration>
) {
fun getByName(name: String) = configurations
.firstOrNull { it.name == name }
}
class InMemoryUserRepo: UserRepo {
private val users: MutableList<User> = mutableListOf()
override fun getUser(id: UserId): User?
= users.firstOrNull { it.id == id }

fun addUser(user: User) {
user.add(user)
}
}

This is rarely the best way to store those elements. Notice how those data we loaded are used. We most often access an element by some identifier or name (it is connected to how we design data to have unique keys in databases). Finding an element in a list has O(n) complexity, where n is the size of the list. More concretely, it takes on average n/2 comparisons to find an element. It is especially problematic for bigger lists. A good solution to this problem is to use Map instead of a List. Kotlin by default uses a hash map (concretely LinkedHashMap), and as we described in Item 41: Respect the contract of hashCode, the performance of finding an element when we use the hash map is much better. Actually, in JVM, thanks to the fact that the size of the used hash map is adjusted to the size of the map itself, if hashCode is implemented properly, finding an element should take one comparison only.

This is InMemoryRepo using a map instead of a list:

class InMemoryUserRepo: UserRepo {
private val users: MutableMap<UserId, User> = mutableMapOf()
override fun getUser(id: UserId): User? = users[id]

fun addUser(user: User) {
user.put(user.id, user)
}
}

Most other operations, like modifying or iterate over those data (likely using collection processing methods like filter, map, flatMap, sorted, sum, etc.) have more or less the same performance for the standard list and map.

The thing is how do we transform from a list to a map and the other way around. For that we use associate functions. The most common one is associateBy that builds a map where values are the elements from the list, and keys are produced on the lambda expression:

data class User(val id: Int, val name: String)
val users = listOf(User(1, "Michal"), User(2, "Marek"))
val byId = users.associateBy { it.id }
byId == mapOf(1 to User(1, "Michal"), 2 to User(2, "Marek"))
val byName = users.associateBy { it.name }
byName == mapOf("Michal" to User(1, "Michal"),
"Marek" to User(2, "Marek"))

Notice that keys in maps must be unique or duplicates are removed. This is why we should only associate by a unique identifier (to group by something that is not unique, use groupBy).

val users = listOf(User(1, "Michal"), User(2, "Michal"))val byName = users.associateBy { it.name }
byName == mapOf("Michal" to User(2, "Michal"))

To transform the other way around, you can use values:

val users = listOf(User(1, "Michal"), User(2, "Michal"))
val byId = users.associateBy { it.id }
users == byId.values

This is how the other repositories can be implemented using a map to improve the performance of elements access:

class NetworkUserRepo(val userService: UserService): UserRepo {
private var users: Map<UserId, User>? = null
override fun
getUser(id: UserId): User? {
if(users == null) {
users = userService.getUsers().associateBy { it.id }
}
return users?.get(id)
}
}

class ConfigurationsRepository(
configurations: List<Configuration>
) {
val configurations: Map<String, Configuration> =
configurations.associateBy { it.name }

fun
getByName(name: String) = configurations[name]
}

This technique is important, but it is not for all the cases. It is more useful when we often need to access those elements. This is why it is especially important on backend where those collections might be accessed many times per second. It is not so important on the frontend (by that I mean also Android or iOS) where a user will access this repository at most a few times. We also need to remember that mapping from a list to a map takes some time as well, so if we do it often, it might hurt our performance as well.

Click the 👏 to say “thanks!” and help others find this article.

To be up-to-date with great news on Kt. Academy, subscribe to the newsletter, observe Twitter and follow us on medium.

If you need a Kotlin workshop, check how we can help you: kt.academy.

--

--

Kt. Academy creator, co-author of Android Development with Kotlin, author of open-source libraries, community activist. http://marcinmoskala.com/