Throughout this introductory chapter, we've mentioned a couple of times that Elixir has protocols, with Enumerable being one of the examples. In this section, we'll dive into protocols and even define our own!
Protocols, like the behaviours we saw in the last section, define a set of functions that have to be implemented. In that sense, both constructs are a way to achieve polymorphism in Elixir: the ability to present multiple forms of behavior behind a single interface. The difference lies in what they're tied to. Behaviours define a set of functions that a module needs to implement, and are thus tied to a module, whereas protocols define a set of functions that a data type must implement. This means that, with protocols, we have data type polymorphism, and we're able to write functions that behave differently depending on the type of their arguments.
Let's now see how we can create a new protocol. We'll pick up, and extend, the example from the official Getting Started guide (at http://elixir-lang.github.io/getting-started/protocols.html). We will define a Size protocol, which will be implemented by each data type we're interested in. To define a new protocol, we use the defprotocol construct:
$ cat examples/size.ex
defprotocol Size do
  @doc "Calculates the size of a data structure"
  def size(data)
end
We're stating that any data type implementing our Size protocol must define a size/1 function, where the argument is the data structure we want to know the size of.
You can use the @doc attribute to add documentation to this function, as you normally do with named functions inside modules. We can now define the implementation of this protocol for the data types we're interested in, using the defimpl construct:
$ cat examples/size_implementations_basic_types.ex
defimpl Size, for: BitString do
  def size(string), do: byte_size(string)
end

defimpl Size, for: Map do
  def size(map), do: map_size(map)
end

defimpl Size, for: Tuple do
  def size(tuple), do: tuple_size(tuple)
end
Note
We didn't define an implementation for lists because, in Elixir, size is usually used for data structures that have their size precomputed. For types where the size has to be computed on demand, such as lists, the term length is used instead. This is reflected in the name of the function used to get the number of elements in a list: Kernel.length/1.
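As a quick illustration of that naming convention, here is a small sketch that uses only functions from Kernel. The comments about cost describe the general convention, and the values shown are the ones these calls return:

```elixir
# The *_size functions read a value that is stored with the data structure.
IO.inspect(tuple_size({1, 2, 3}))        # 3
IO.inspect(map_size(%{a: "b", c: "d"}))  # 2
IO.inspect(byte_size("a string"))        # 8

# length/1 has to walk the whole list to count its elements.
IO.inspect(length([1, 2, 3, 4]))         # 4
```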
With this defined, we can see our protocol in action:
iex> Size.size("a string")
8
iex> Size.size(%{a: "b", c: "d"})
2
iex> Size.size({1, 2, 3})
3
If we try to use our protocol on a type that doesn't have an implementation defined, an error is raised:
iex> Size.size([1, 2, 3, 4])
** (Protocol.UndefinedError) protocol Size not implemented for [1, 2, 3, 4]
Note
You can define an implementation for a protocol on all Elixir data types: Atom, BitString, Float, Function, Integer, Tuple, List, Map, PID, Port, and Reference. Note that BitString is used for the binary type as well.
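To make the last point concrete, the following sketch repeats the Size protocol and its BitString implementation from above so that it can run on its own, and then calls it with a string, a raw binary, and a bitstring that doesn't end on a byte boundary. All three are handled by the same implementation:

```elixir
defprotocol Size do
  @doc "Calculates the size of a data structure"
  def size(data)
end

defimpl Size, for: BitString do
  def size(string), do: byte_size(string)
end

# Strings are binaries, and binaries are bitstrings whose number of bits is
# a multiple of 8, so both calls dispatch to the BitString implementation.
IO.inspect(Size.size("a string"))      # 8
IO.inspect(Size.size(<<1, 2, 3>>))     # 3

# byte_size/1 rounds up when the bitstring has a partial trailing byte.
IO.inspect(Size.size(<<1::size(3)>>))  # 1
```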
Having to implement a protocol for all types may quickly become monotonous and exhausting. You can define a fallback behavior for types that don't implement your protocol by implementing the protocol for Any. Let's do this for our Size protocol:
$ cat examples/size_implementation_any.ex
defimpl Size, for: Any do
  def size(_), do: 0
end
You have to define the desired behavior when a type doesn't implement your protocol. In this case, we're saying that it has a size of 0 (which might not make sense, since the data type may have a size different than 0, but let's ignore that detail).
We now have two options for this implementation to be used: either mark the modules where we want this fallback behavior with @derive [Size] (this applies to modules that define a struct, which we'll cover shortly, with the attribute placed before defstruct), or use @fallback_to_any true in the definition of our Size protocol. The former is more laborious, as you have to annotate each module that should assume the behavior for Any, while the latter is simpler, since it makes the fallback work on all data types just by changing the definition of your protocol. In the Elixir community, explicitness is usually preferred and, as such, you're more likely to see the @derive approach in Elixir projects.
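The two options can be compared side by side in a short sketch. The names DerivedExample, SizeWithFallback, and FallbackDemo are hypothetical and exist only for this comparison; the second protocol is there so that both variants can live in the same file:

```elixir
# Option 1: keep the protocol strict and opt in per struct with @derive.
defprotocol Size do
  def size(data)
end

defimpl Size, for: Any do
  def size(_), do: 0
end

defmodule DerivedExample do
  # Hypothetical struct; @derive has to come before defstruct.
  @derive [Size]
  defstruct [:value]
end

# Option 2: let the protocol itself fall back to Any for every type.
defprotocol SizeWithFallback do
  @fallback_to_any true
  def size(data)
end

defimpl SizeWithFallback, for: Any do
  def size(_), do: 0
end

defmodule FallbackDemo do
  # The derived struct uses the Any implementation of Size.
  def derived, do: Size.size(%DerivedExample{value: 42})

  # Any type works here, because of @fallback_to_any.
  def fallback, do: SizeWithFallback.size([1, 2, 3, 4])

  # Size has no fallback, so a list is still not supported.
  def not_derived do
    Size.size([1, 2, 3, 4])
  rescue
    Protocol.UndefinedError -> :not_implemented
  end
end

IO.inspect(FallbackDemo.derived())      # 0
IO.inspect(FallbackDemo.fallback())     # 0
IO.inspect(FallbackDemo.not_derived())  # :not_implemented
```

With @derive, the set of types that get the fallback is visible in the code that defines them, which is the explicitness mentioned above.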
While implementing protocols for Elixir's data types already opens a world of possibilities, we can only fully utilize Elixir's extensibility when we mix them with structs. We haven't yet talked about structs, so we'll introduce them in the next section.
Structs are an abstraction built on top of maps. We define a struct inside a module, with the defstruct construct. The struct's name is the name of the module it's being defined in (which means you can only define one struct per module). To defstruct, we pass a keyword list, which contains the key-value pairs that define the fields that struct has, along with their default values. Let's define a Folder struct:
$ cat examples/folder.ex
defmodule Folder do
  defstruct name: "new folder", files_info: [], path: nil
end
We can now use it in our IEx session:
iex> %Folder{}
%Folder{files_info: [], name: "new folder", path: nil}
iex> %Folder{}.name
"new folder"
iex> %Folder{}.files_info
[]
Elixir already has a File module, which provides several functions to deal with files. One of them is File.stat/2, which returns a %File.Stat{} struct (wrapped in an {:ok, result} tuple) with information about the provided path. The files_info field in our %Folder{} struct is a list, which will contain %File.Stat{} structs as elements. Let's initialize a folder with one file:
iex> folder = %Folder{files_info: [File.stat!("string_helper.ex")]}
%Folder{files_info: [%File.Stat{access: :read_write,
   atime: {{2017, 12, 31}, {16, 58, 56}}, ctime: {{2017, 12, 30}, {3, 40, 29}},
   gid: 100, inode: 3290229, links: 1, major_device: 65024, minor_device: 0,
   mode: 33188, mtime: {{2017, 12, 30}, {3, 40, 29}}, size: 509,
   type: :regular, uid: 1000}], name: "new folder", path: nil}
Note that this example assumes you have a "string_helper.ex" file in the directory where you started iex. Also note that we're using File.stat!, which works similarly to File.stat but, instead of returning an {:ok, result} tuple, returns the result itself (and raises an error if the call fails).
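The difference between the two variants can be seen in the following sketch. To avoid depending on a file that may not exist on your machine, it writes a small temporary file first; the StatDemo module and the file name are arbitrary choices made for this example:

```elixir
defmodule StatDemo do
  def run do
    # Create a throwaway file with known content (5 bytes).
    path = Path.join(System.tmp_dir!(), "stat_demo_example.txt")
    File.write!(path, "hello")

    # File.stat/1 returns a tagged tuple that we can pattern match on.
    {:ok, stat} = File.stat(path)

    # File.stat!/1 returns the struct directly, or raises on failure.
    same_size? = File.stat!(path).size == stat.size

    File.rm!(path)

    # Once the file is gone, the non-raising variant returns an error tuple.
    {stat.size, same_size?, File.stat(path)}
  end
end

IO.inspect(StatDemo.run())  # {5, true, {:error, :enoent}}
```

The raising variant is convenient in an IEx session, as in our example, while the tuple-returning one is usually a better fit when a missing file is an expected situation that the code should handle.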
We now have our %Folder{} struct with one file. Let's look at the syntax to update a struct, which is similar to the one used for maps (you can also use the functions from the Map module). Assuming you also have a "recursion.ex" file in your current working directory, you can use this syntax to update the struct:
iex> folder = %Folder{folder | files_info: [File.stat!("recursion.ex") | folder.files_info]}
%Folder{files_info: [%File.Stat{access: :read_write,
   atime: {{2017, 12, 30}, {20, 8, 29}}, ctime: {{2017, 12, 30}, {20, 8, 25}},
   gid: 100, inode: 3278529, links: 1, major_device: 65024, minor_device: 0,
   mode: 33188, mtime: {{2017, 12, 30}, {20, 8, 25}}, size: 270,
   type: :regular, uid: 1000},
  %File.Stat{access: :read_write, atime: {{2017, 12, 31}, {16, 58, 56}},
   ctime: {{2017, 12, 30}, {3, 40, 29}}, gid: 100, inode: 3290229, links: 1,
   major_device: 65024, minor_device: 0, mode: 33188,
   mtime: {{2017, 12, 30}, {3, 40, 29}}, size: 509, type: :regular,
   uid: 1000}], name: "new folder", path: nil}
iex> folder.files_info
[%File.Stat{access: :read_write, atime: {{2017, 12, 30}, {20, 8, 29}},
  ctime: {{2017, 12, 30}, {20, 8, 25}}, gid: 100, inode: 3278529, links: 1,
  major_device: 65024, minor_device: 0, mode: 33188,
  mtime: {{2017, 12, 30}, {20, 8, 25}}, size: 270, type: :regular, uid: 1000},
 %File.Stat{access: :read_write, atime: {{2017, 12, 31}, {16, 58, 56}},
  ctime: {{2017, 12, 30}, {3, 40, 29}}, gid: 100, inode: 3290229, links: 1,
  major_device: 65024, minor_device: 0, mode: 33188,
  mtime: {{2017, 12, 30}, {3, 40, 29}}, size: 509, type: :regular, uid: 1000}]
As you can see, we now have two files in our %Folder{} struct.
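The different ways of updating a struct can be put next to each other in a small sketch. It repeats the Folder definition so that it runs on its own, and UpdateDemo is a hypothetical module that only wraps the calls:

```elixir
defmodule Folder do
  defstruct name: "new folder", files_info: [], path: nil
end

defmodule UpdateDemo do
  def run do
    folder = %Folder{}

    # Struct update syntax: also checks that folder is a %Folder{}.
    renamed = %Folder{folder | name: "docs"}

    # The map update syntax works on structs too. Like the form above,
    # it only accepts keys that already exist.
    moved = %{renamed | path: "/a/b/c/"}

    # Functions from the Map module are another option.
    via_map = Map.put(renamed, :path, "/tmp/")

    # folder itself is unchanged, since data is immutable.
    {folder.name, renamed.name, moved.path, via_map.path}
  end
end

IO.inspect(UpdateDemo.run())
# {"new folder", "docs", "/a/b/c/", "/tmp/"}
```

Note that Map.put/3 doesn't check whether the key is a field of the struct, so the update syntax is usually the safer choice.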
Note
Although structs are implemented on top of maps, they do not share protocol implementations with the Map module. This means that you can't, out of the box, iterate on a struct, as it doesn't implement the Enumerable protocol.
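The following sketch shows what this looks like in practice, along with one common way around it: converting the struct to a plain map with Map.from_struct/1. The Folder definition is repeated so that the example is self-contained, and EnumerableDemo is a hypothetical module name:

```elixir
defmodule Folder do
  defstruct name: "new folder", files_info: [], path: nil
end

defmodule EnumerableDemo do
  # Enum functions dispatch through the Enumerable protocol, so calling
  # them with a struct that doesn't implement it raises an error.
  def count_directly(data) do
    Enum.count(data)
  rescue
    Protocol.UndefinedError -> :not_enumerable
  end

  # Map.from_struct/1 returns a plain map without the :__struct__ key,
  # and plain maps are enumerable.
  def field_names(data) do
    data |> Map.from_struct() |> Map.keys() |> Enum.sort()
  end

  def run do
    folder = %Folder{}
    {count_directly(folder), field_names(folder)}
  end
end

IO.inspect(EnumerableDemo.run())
# {:not_enumerable, [:files_info, :name, :path]}
```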
We'll end our little tour of structs with two more bits of information. First, if you don't provide a default value when defining the fields of a struct, nil will be assumed as its default value. Second, you can enforce that certain fields are required when creating your struct. You do that with the @enforce_keys module attribute. If we wanted to make sure path was provided when creating our %Folder{} struct, we would define it as follows:
$ cat examples/folder_with_enforce_keys.ex
defmodule Folder do
  @enforce_keys [:path]
  defstruct name: "new folder", files_info: [], path: nil
end
If you don't provide path when creating this struct, an ArgumentError will be raised:
iex> %Folder{}
** (ArgumentError) the following keys must also be given when building struct Folder: [:path]
    expanding struct: Folder.__struct__/1
    iex:46: (file)
iex> %Folder{path: "/a/b/c/"}
%Folder{files_info: [], name: "new folder", path: "/a/b/c/"}
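Both bits of information can be combined in one sketch. It assumes a variation of our Folder struct with an extra owner field, which is hypothetical and not used elsewhere in this chapter, to show a field without an explicit default. It also uses struct!/2, which builds a struct at runtime and applies the same check on enforced keys:

```elixir
defmodule Folder do
  # :path is required. :owner is a hypothetical field with no explicit
  # default, so it defaults to nil.
  @enforce_keys [:path]
  defstruct [:path, :owner, name: "new folder", files_info: []]
end

defmodule EnforceKeysDemo do
  def run do
    folder = %Folder{path: "/a/b/c/"}

    # struct!/2 raises ArgumentError when an enforced key is missing.
    missing =
      try do
        struct!(Folder, name: "no path")
      rescue
        ArgumentError -> :path_required
      end

    {folder.owner, folder.name, missing}
  end
end

IO.inspect(EnforceKeysDemo.run())  # {nil, "new folder", :path_required}
```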
Now that we have the %Folder{} struct defined, we can define its implementation for the Size protocol.
We'll first define the implementation for the %File.Stat{} struct, as we can then use this to implement the protocol for %Folder{}. Here's the implementation for %File.Stat{}:
$ cat examples/size_implementations_file_stat_and_folder.ex
defimpl Size, for: File.Stat do
  def size(file_stat), do: file_stat.size
end
# ...
With this in place, our implementation for our %Folder{} struct is as follows:
$ cat examples/size_implementations_file_stat_and_folder.ex
# ...
defimpl Size, for: Folder do
  def size(folder) do
    folder.files_info
    |> Enum.map(&Size.size(&1))
    |> Enum.sum()
  end
end
To find out the size of a folder, we sum the size of each file it contains. As such, this implementation iterates through our files_info list, using the Size implementation for %File.Stat{} to get the size of each file, and sums all the sizes at the end. In the following snippet, we can see this implementation being used on the folder variable we defined earlier:
iex> Size.size(folder)
779
With this, we can see the full power of mixing structs and protocols, which lets us have polymorphic functions based on the data type of their arguments. We now have a common interface, Size.size(data), that allows us to find out the size of pretty much anything we want, provided that we implement the Size protocol for the data type we're interested in.
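As a final sketch of this extensibility, we can add an implementation for a struct that ships with Elixir, MapSet, and then write a function that sums the sizes of a mixed list of values. The protocol and two of the earlier implementations are repeated so that the example runs on its own, and SizeReport is a hypothetical helper module:

```elixir
defprotocol Size do
  @doc "Calculates the size of a data structure"
  def size(data)
end

defimpl Size, for: Map do
  def size(map), do: map_size(map)
end

defimpl Size, for: Tuple do
  def size(tuple), do: tuple_size(tuple)
end

# MapSet is a struct, so it needs its own implementation. It doesn't
# reuse the one we defined for Map.
defimpl Size, for: MapSet do
  def size(set), do: MapSet.size(set)
end

defmodule SizeReport do
  # Works with any mix of values whose types implement Size.
  def total(items), do: items |> Enum.map(&Size.size/1) |> Enum.sum()
end

items = [%{a: "b", c: "d"}, {1, 2, 3}, MapSet.new([1, 2, 3, 4])]
IO.inspect(SizeReport.total(items))  # 9
```

SizeReport.total/1 never has to know which types it receives. Supporting a new type only requires a new defimpl, without touching the function.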