# Semigroup And Monoid In Scala

This post is a sequel to Monad In Scala and Applicative vs Monad In Scala.

In order to learn what `Monoid`

is, we first need to look at what `Semigroup`

is first.

## Semigroup

`Semigroup`

is a typeclass that defines `combine`

method (a.k.a `append`

in Scalaz etc). Naming convention for such method seems to vary per each library, yet the essence is straightforward. It’s a method that **combine**s two `semigroup`

instances.

`Int`

is an instance of `Semigroup`

and its *analogous* combine method is `+`

(add two integers).

`String`

is an instance `Semigroup`

and its *analogous* combine method is `+`

(concatenate two strings).

Note the word *analogous* since they aren’t the methods defined for `Semigroup`

per se, these standard library methods simply happen to resemble what `Semigroup`

’s combine method does.

So then now we wonder, what’s the use of this typeclass when all it does is to combine stuff, which is so common in programming that every language’s standard library already implements them from the get-go?

Like most other typeclass (except the ones like `Monad`

), they **compose**. This is immensely useful.

Let’s look `List[String]`

.

```
Semigroup[List[String]].combine(List("hello"), List("world")) // List("hello", "world")
List("hello") ++ List("world") // List("hello", "world")
```

Okay both Semigroup version and standard library version do the same thing. What about `Map[String, String]`

?

```
val map1 = Map("foo" -> "hello"); val map2 = Map("foo" -> "world")
Semigroup[Map[String, String]].combine(map1, map2) // Map("foo" -> "helloworld")
map1 ++ map2 // Map("foo" -> "world")
```

This is interesting. The reason that they differ is because `Semigroup`

**compose**s while Map’s `++`

operator doesn’t. Before we dive deeper, there is an operator for `combine`

in `Semigroup`

: `|+|`

. So from now on, we can call `map1 |+| map2`

instead.

When we call `map1 |+| map2`

, whose type is `Map[String, String]`

, given that `Semigroup`

knows how to combine `Map[String, V]`

and `String`

(that is, there is `Semigroup`

defined for each), it is able to work on `Map[String, String]`

as well (which is to say that they are `composable`

). An abstract implementation of `Map[K, V]`

is:

```
class MapSemigroup[K, V](implicit V: Semigroup[V]) extends Semigroup[Map[K, V]] {
def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] = addMap(x, y)(V.combine)
}
```

All we need to understand is that it takes `Semigroup[V]`

, which in our case is `Semigroup[String]`

and pass its `combine`

method to opaque `addMap`

function that does the work of combining. Thus, `Semigroup[Map[String, String]]`

has access to both its own implementation of `combine`

and also the implementation of `combine`

of its `V`

, enabling the composition.

Compare this to `map1 ++ map2`

which defines a monolithic definition of simply replacing the duplicate key’s values instead of applying custom implementation for those values. This is due to lack of **compositionality**.

To see its true usefulness, let me throw this at you:

```
Map("foo" -> Map("foo1" -> "v1"), "bar" -> Map("bar1" -> "v2")) |+| Map("foo" -> Map("foo1" -> "v3"))
// Map("foo" -> Map("foo1" -> "v1v3"), "bar" -> Map("bar1" -> "v2"))
```

## Monoid

Conceptually, `Monoid`

is to `Semigroup`

like `Applicative`

is to `Apply`

. All it does is to define an additional method `empty`

(a.k.a `zero`

in Scalaz) which returns an **identity** value. When you **combine** identity value (which is in and of itself, a `Monoid`

instance) with another `Monoid`

instance, it simply returns the combined `Monoid`

in return. It applies no change to it.

Think of an empty string `""`

as an example. Combining empty string with string `"Hello"`

returns `"Hello"`

.

This is a simple extension to `Semigroup`

but derived power from this is immense.

What this simple addition allows us to do is an ability to fall back to default for possibly empty values as well as an ability to `fold`

.

It allows us to implement useful typeclass like `Foldable`

, which is beyond the scope of this post.

Let’s look at example from algebra library:

```
// To be able to combine 3 strings
// contrived version in Seimgroup
Semigroup[String].combine("foo", Semigroup[String].combine("bar", "baz"))
// Monoid
Monoid[String].combineAll(List("foo", "bar", "baz"))
```

In algebra library, `combineAll`

method internally calls `fold`

. `fold`

ing over `Semigroup`

is not possible since we don’t have an initial value to start `fold`

from. This is solved by using `Monoid`

which now has an `empty`

method which returns an identity value, which can happily be used as initial value.

In fact, `fold`

method in `Foldable`

(which is a typeclass that defines `fold`

-like methods) requires an instance of `Monoid`

through implicit parameters.

`Monoid`

allows us to work at higher abstraction where we can combine an **non-deterministic** values (`List`

is an example) through the use of default value, that is, `empty`

. In `Semigroup`

, values are fully **deterministic** in that we specify them upfront using notations like `|+|`

. We cannot pass a varying number of values contained within, say, `List`

to it. If the `List`

happens to be empty, then it will simply *fall back* to its `empty`

method.

In context of functional programming, non-determinism is quite abstract and takes longer post than this to explain. Though you can refer to wiki here.

### Takeaway

Use `Semigroup`

when you **can** explicitly define all values involved (**deterministic**) to combine. Think of simple addition of **two** integers.

Use `Monoid`

when you **cannot** explicitly define all values involved (**non-deterministic**) to combine. Think of `fold`

over a **List** of integers. (`Foldable`

’s `fold`

method is defined through `Monoid`

’s `empty`

and `combine`

methods)

But for all purposes, there is rarely a reason to use lesser-compentent, yet more freeing, `Semigroup`

over `Monoid`

(Unlike the relationship between `Applicative`

and `Monad`

) since `Monoid`

is in fact, a strict superset of `Semigroup`

and can access all its methods without penalties.

You can safely default to `Monoid`

for most.