The Clojure Workshop

By: Joseph Fahey, Thomas Haratyk, Scott McCaughie, Yehonathan Sharvit, Konrad Szydlo

Overview of this book

The Clojure Workshop is a step-by-step guide to Clojure and ClojureScript, designed to quickly get you up and running as a confident, knowledgeable developer. Because of the functional nature of the language, Clojure programming is quite different from what many developers will have experienced. As hosted languages, Clojure and ClojureScript can also be daunting for newcomers because of complexities in the tooling and the challenge of interacting with the host platforms. To help you overcome these barriers, this book adopts a practical approach. Every chapter is centered around building something. As you progress through the book, you will develop the 'muscle memory' that will make you a productive Clojure programmer, and help you see the world through the concepts of functional programming. You will also gain familiarity with common idioms and patterns, as well as exposure to some of the most widely used libraries. Unlike many Clojure books, this Workshop includes significant coverage of both Clojure and ClojureScript. This makes it useful no matter your goal or preferred platform, and provides a fresh perspective on the hosted nature of the language. By the end of this book, you'll have the knowledge, skills, and confidence to creatively tackle your own ambitious projects with Clojure and ClojureScript.
2. Data Types and Immutability

Simple Data Types

A data type designates what kind of value a piece of data holds; it is a fundamental way of classifying data. Different types allow different kinds of operations: we can concatenate strings, multiply numbers, and perform Boolean algebra operations with Booleans. Because Clojure has a strong emphasis on practicality, we don't explicitly assign types to values, but those values still have a type.

Clojure is a hosted language and has three notable, major implementations: on the JVM (Java), on JavaScript, and on .NET. Being a hosted language is a useful trait that allows Clojure programs to run in different environments and take advantage of the ecosystem of their host. Regarding data types, it means that each implementation has different underlying data types, but don't worry, as those are just implementation details. For you as a Clojure programmer, it does not make much difference: if you know how to do something in Clojure, you likely know how to do it in, say, ClojureScript.

In this topic, we will go through Clojure's simple data types. Here is the list of the data types looked at in this section. Please note that the following types are all immutable:

  • Strings
  • Numbers
  • Booleans
  • Keywords
  • Nil

Strings

Strings are sequences of characters representing text. We have been using and manipulating strings since the first exercise of Chapter 1, Hello REPL!.

You can create a string by simply wrapping characters with double quotes ("):

user=> "I am a String"
"I am a String"
user=> "I am immutable"
"I am immutable"

String literals are only created with double quotes, and if you need to use double quotes in a string, you can escape them with the backslash character (\):

user=> (println "\"The measure of intelligence is the ability to change\" - Albert Einstein")
"The measure of intelligence is the ability to change" - Albert Einstein
nil

Strings cannot be changed; they are immutable. Any function that claims to transform a string yields a new value:

user=> (def silly-string "I am Immutable. I am a silly String")
#'user/silly-string
user=> (clojure.string/replace silly-string "silly" "clever")
"I am Immutable. I am a clever String"
user=> silly-string
"I am Immutable. I am a silly String"

In the preceding example, calling clojure.string/replace on silly-string returned a new string with the word "silly" replaced with "clever." However, when evaluating silly-string again, we can see that the value has not changed. The function returned a different value and did not change the original string.

Although a string usually represents a single unit of text, strings are also collections of characters. In the JVM implementation of Clojure, strings are of the java.lang.String Java type, and their elements are of the java.lang.Character Java type. For example, the following command returns a character:

user=> (first "a collection of characters")
\a
user=> (type *1)
java.lang.Character

first returns the first element of a collection. Here, the literal notation of a character is \a. The type function returns the type of a given value (on the JVM, its class), which the REPL prints by name. Remember that we can use *1 to retrieve the last returned value in the REPL, so *1 evaluates to \a.
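
As a small illustration of treating a string as a collection on the JVM, the following sketch applies a few clojure.core functions to a string. The name greeting is made up for this example, and the comments show the values a JVM Clojure REPL is expected to print:

```clojure
;; Hypothetical example value, not taken from the chapter.
(def greeting "hello")

(count greeting)         ; number of characters => 5
(first greeting)         ; => \h
(nth greeting 1)         ; zero-based index => \e
(seq greeting)           ; the characters as a sequence => (\h \e \l \l \o)
(type (nth greeting 1))  ; => java.lang.Character
```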

It is interesting to note that, in ClojureScript, strings are collections of one-character strings, because there is no character type in JavaScript. Here is a similar example in a ClojureScript REPL:

cljs.user=> (last "a collection of 1 character strings")
"s"
cljs.user=> (type *1)
#object[String]

As with the Clojure REPL, type returns the data type of the value. This time, in ClojureScript, the value returned by the last function (which returns the last element of a collection, here the last character of the string) is of the #object[String] type, which means a JavaScript string.

You can find a few common functions for manipulating strings in the core namespace, such as str, which we used in Chapter 1, Hello REPL!, to concatenate (combine multiple strings together into one string):

user=> (str "That's the way you " "con" "ca" "te" "nate")
"That's the way you concatenate"
user=> (str *1 " - " silly-string)
"That's the way you concatenate - I am Immutable. I am a silly String"

Most functions for manipulating strings can be found in the clojure.string namespace. Here is a list of them using the REPL dir function:

user=> (dir clojure.string)
blank?
capitalize
ends-with?
escape
includes?
index-of
join
last-index-of
lower-case
re-quote-replacement
replace
replace-first
reverse
split
split-lines
starts-with?
trim
trim-newline
triml
trimr
upper-case

As a reminder, this is how you can use a function from a specific namespace:

user=> (clojure.string/includes? "potatoes" "toes")
true

We will not cover all the string functions, but feel free to try them out now. You can always look up the documentation of a string function from the preceding list with the doc function.
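
To give a feel for the list above, here is a minimal sketch that tries a handful of those functions. The string alias and the raw value are assumptions made for this example only, and the square-bracketed values are vectors, a collection type covered later:

```clojure
(require '[clojure.string :as string])

;; Hypothetical input, chosen only for illustration.
(def raw "  clojure is fun  ")

(string/trim raw)                          ; => "clojure is fun"
(string/upper-case (string/trim raw))      ; => "CLOJURE IS FUN"
(string/split (string/trim raw) #" ")      ; => ["clojure" "is" "fun"]
(string/join "-" ["clojure" "is" "fun"])   ; => "clojure-is-fun"
(string/capitalize "clojure")              ; => "Clojure"
(string/blank? "   ")                      ; => true
```

Note that raw itself is unchanged after all of these calls, which is the immutability described earlier.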

Numbers

Clojure has good support for numbers and you will most likely not have to worry about the underlying types, as Clojure will handle pretty much anything. However, it is important to note that there are a few differences between Clojure and ClojureScript in that regard.

In Clojure, by default, integers are implemented as the java.lang.Long Java type unless the number is too big for Long. In that case, it is typed as clojure.lang.BigInt:

user=> (type 1)
java.lang.Long
user=> (type 1000000000000000000)
java.lang.Long
user=> (type 10000000000000000000)
clojure.lang.BigInt

Notice, in the preceding example, that the last number was too big to fit in the java.lang.Long Java type (whose maximum value is 9223372036854775807) and, therefore, was implicitly typed as clojure.lang.BigInt.
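
The boundary between Long and BigInt can be explored a little further. The sketch below is not from the chapter; it assumes a JVM Clojure REPL with default settings and uses the N literal suffix and the auto-promoting +' function from clojure.core:

```clojure
(type 1N)                     ; the N suffix forces a BigInt literal => clojure.lang.BigInt
Long/MAX_VALUE                ; largest Long => 9223372036854775807
(+' Long/MAX_VALUE 1)         ; +' promotes instead of overflowing => 9223372036854775808N
(type (+' Long/MAX_VALUE 1))  ; => clojure.lang.BigInt

;; With default settings, plain + does not promote: it throws an
;; ArithmeticException, which we catch here to show a value instead.
(try
  (+ Long/MAX_VALUE 1)
  (catch ArithmeticException _ "overflow"))  ; => "overflow"
```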

Exact ratios are represented by Clojure as "Ratio" types, which have a literal representation. 5/4 cannot be reduced to an integer, so the output is the ratio itself:

user=> 5/4
5/4

The result of dividing 3 by 4 can be represented by the ratio 3/4:

user=> (/ 3 4)
3/4
user=> (type 3/4)
clojure.lang.Ratio

4/4 is equivalent to 1 and is evaluated as follows:

user=> 4/4
1
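
As a brief, hedged illustration of working with ratios on the JVM, the following lines use only clojure.core functions; none of them appear in the chapter itself:

```clojure
(+ 1/2 1/3)        ; ratio arithmetic stays exact => 5/6
(numerator 6/8)    ; 6/8 is read in lowest terms, as 3/4 => 3
(denominator 6/8)  ; => 4
(ratio? 3/4)       ; => true
(ratio? 4/4)       ; 4/4 is read as the integer 1 => false
(double 3/4)       ; explicit conversion to a double => 0.75
```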

Numbers written with a decimal point are double-precision floating-point numbers:

user=> 1.2
1.2

If we take our division of 3 by 4 again, but this time mix in a "Double" type, we will not get a ratio as a result:

user=> (/ 3 4.0)
0.75

This is because floating-point numbers are "contagious" in Clojure. Any operation involving floating-point numbers will result in a float or a double:

user=> (* 1.0 2)
2.0
user=> (type (* 1.0 2))
java.lang.Double
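
A few more cases of this "contagion", sketched for a JVM Clojure REPL. The comparison between = and == is an addition that the chapter does not discuss:

```clojure
(+ 1 2.0)           ; long + double => 3.0
(+ 1/2 0.5)         ; ratio + double => 1.0
(type (+ 1/2 0.5))  ; => java.lang.Double
(== 1 1.0)          ; numeric equality across types => true
(= 1 1.0)           ; = also compares the kind of number => false
```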

In ClojureScript, however, numbers are just "JavaScript numbers," which are all double-precision floating-point numbers. JavaScript does not define different types of numbers like Java and some other programming languages do (for example, long, integer, and short):

cljs.user=> 1
1
cljs.user=> 1.2
1.2
cljs.user=> (/ 3 4)
0.75
cljs.user=> 3/4
0.75
cljs.user=> (* 1.0 2)
2

Notice that, this time, every operation returns a floating-point number. The absence of a decimal point in 1 or 2 is just a formatting convenience.

We can make sure that all those numbers are JavaScript numbers (double-precision, floating-point) by using the type function:

cljs.user=> (type 1)
#object[Number]
cljs.user=> (type 1.2)
#object[Number]
cljs.user=> (type 3/4)
#object[Number]

If you need to do more than simple arithmetic, you can use the Java or JavaScript math libraries, which are similar apart from a few minor differences.

You will learn more about host platform interoperability (how to interact with the host platform and its ecosystem) in Chapter 9, Host Platform Interoperability with Java and JavaScript, but the following examples will get you started with doing some more complicated math and with using the math library.

Reading a value from a constant can be done like this:

user=> Math/PI
3.141592653589793

And calling a function, like the usual Clojure functions, can be done like this:

user=> (Math/random)
0.25127992428738254
user=> (Math/sqrt 9)
3.0
user=> (Math/round 0.7)
1
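
These calls combine like any other Clojure function. The following sketch assumes the JVM (java.lang.Math); the hypotenuse function is a hypothetical helper written for this example, not part of the chapter:

```clojure
;; Hypothetical helper: length of the hypotenuse of a right triangle
;; with sides a and b, computed with java.lang.Math static methods.
(defn hypotenuse
  [a b]
  (Math/sqrt (+ (Math/pow a 2) (Math/pow b 2))))

(hypotenuse 3 4)  ; => 5.0
(Math/pow 2 10)   ; Math/pow always returns a double => 1024.0
(Math/abs -5)     ; => 5
(Math/floor 2.7)  ; => 2.0
(Math/ceil 2.1)   ; => 3.0
```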

Exercise 2.01: The Obfuscation Machine

You have been contacted by a secret government agency to develop an algorithm that encodes text into a secret string that only the owner of the algorithm can decode. Apparently, they don't trust other security mechanisms such as SSL and will only communicate sensitive information with their own proprietary technology.

You need to develop an encode function and a decode function. The encode function should replace letters with numbers that are not easily guessable. For that purpose, each letter will take the character's number value in the ASCII table, add another number to it (the number of words in the sentence to encode), and finally, compute the square value of that number. The decode function should allow the user to revert to the original string. Someone highly ranked in the agency came up with that algorithm so they trust it to be very secure.

In this exercise, we will put into practice some of the things we've learned about strings and numbers by building an obfuscation machine:

  1. Start your REPL and look up the documentation of the clojure.string/replace function:
    user=> (doc clojure.string/replace)
    -------------------------
    clojure.string/replace
    ([s match replacement])
      Replaces all instance of match with replacement in s.
       match/replacement can be:
       string / string
       char / char
       pattern / (string or function of match).
       See also replace-first.
       The replacement is literal (i.e. none of its characters are treated
       specially) for all cases above except pattern / string.
       For pattern / string, $1, $2, etc. in the replacement string are
       substituted with the string that matched the corresponding
       parenthesized group in the pattern.  If you wish your replacement
       string r to be used literally, use (re-quote-replacement r) as the
       replacement argument.  See also documentation for
       java.util.regex.Matcher's appendReplacement method.
       Example:
       (clojure.string/replace "Almost Pig Latin" #"\b(\w)(\w+)\b" "$2$1ay")
       -> "lmostAay igPay atinLay"

    Notice that the replace function can take a pattern and a function of the matching result as parameters. We don't know how to iterate over collections yet, but using the replace function with a pattern and a "replacement function" should do the job.

  2. Try and use the replace function with the #"\w" pattern (which means word character), replace it with the ! character, and observe the result:
    user=> (clojure.string/replace "Hello World" #"\w" "!")

    The output is as follows:

    "!!!!! !!!!!"
  3. Try and use the replace function with the same pattern, but this time passing an anonymous function that takes the matching letter as a parameter:
    user=> (clojure.string/replace "Hello World" #"\w" (fn [letter] (do (println letter) "!")))

    The output is as follows:

    H
    e
    l
    l
    o
    W
    o
    r
    l
    d
    "!!!!! !!!!!"

    Observe that the function was called for each letter, printing the match out to the console and finally returning the string with the matches replaced by the ! character. It looks like we should be able to write our encoding logic in that replacement function.

  4. Let's now see how we can convert a character to a number. We can use the int function, which coerces its parameter to an integer. It can be used like this:
    user=> (int \a)
    97
  5. It seems that the "replacement function" will take a string as a parameter, so let's convert our string to a character. Use the char-array function combined with first to convert our string to a character as follows:
    user=> (first (char-array "a"))
    \a
  6. Now, if we combine previous steps together and also compute the square value of the character's number, we should be approaching our obfuscation goal. Combine the code written previously to obtain a character code from a string and get its square value using the Math/pow function as follows:
    user=> (Math/pow (int (first (char-array "a"))) 2)
    9409.0
  7. Let's now convert this result to the string that will be returned from our replace function. Remove the decimal part by coercing the result to an int, then put things together in an encode-letter function, as follows:
    user=>
    (defn encode-letter
      [s]
      (let [code (Math/pow (int (first (char-array s))) 2)]
        (str (int code))))
    #'user/encode-letter
    user=> (encode-letter "a")
    "9409"

    Great! It seems to work. Let's now test our function as part of the replace function.

  8. Create the encode function, which uses clojure.string/replace as well as our encode-letter function:
    user=>
    (defn encode
      [s]
      (clojure.string/replace s #"\w" encode-letter))
    #'user/encode
    user=> (encode "Hello World")
    "518410201116641166412321 756912321129961166410000"

    It seems to work but the resulting string will be hard to decode without being able to identify each letter individually.

    There is another thing that we did not take into account: according to the specification, a number (the number of words in the sentence) should be added to the character code before calculating the square value.

  9. First, add a separator as part of our encode-letter function, for example, the # character, so that we can identify each letter individually. Second, add an extra parameter to encode-letter, which needs to be added before calculating the square value:
    user=>
    (defn encode-letter
      [s x]
      (let [code (Math/pow (+ x (int (first (char-array s)))) 2)]
        (str "#" (int code))))
    #'user/encode-letter
  10. Now, test the encode function another time:
    user=> (encode "Hello World")
    Execution error (ArityException) at user/encode (REPL:3).
    Wrong number of args (1) passed to: user/encode-letter

    Our encode function is now failing because it is expecting an extra argument.

  11. Modify the encode function to calculate the number of words in the text to obfuscate, and pass it to the encode-letter function. You can use the clojure.string/split function with a pattern matching a single space, as follows, to count the number of words:
    user=>
    (defn encode
      [s]
      (let [number-of-words (count (clojure.string/split s #" "))]
        (clojure.string/replace s #"\w" (fn [s] (encode-letter s number-of-words)))))
    #'user/encode
  12. Try your newly created function with a few examples and make sure it obfuscates strings properly:
    user=> (encode "Super secret")
    "#7225#14161#12996#10609#13456 #13689#10609#10201#13456#10609#13924"
    user=> (encode "Super secret message")
    "#7396#14400#13225#10816#13689 #13924#10816#10404#13689#10816#14161 #12544#10816#13924#13924#10000#11236#10816"

    What a beautiful, unintelligible, obfuscated string – well done! Notice how the numbers for the same letters are different depending on the number of words in the phrase to encode. It seems to work according to the specification!

    We can now start working on the decode function, for which we will need to use the following functions:

      • Math/sqrt, to obtain the square root value of a number.
      • char, to retrieve a letter from a character code (a number).
      • subs (as in "substring"), to get a sub-portion of a string (and get rid of our # separator).
      • Integer/parseInt, to convert a string to an integer.

  13. Write the decode-letter function using a combination of the preceding functions, to decode an obfuscated character:
    user=>
    (defn decode-letter
      [x y]
      (let [number (Integer/parseInt (subs x 1))
            letter (char (- (Math/sqrt number) y))]
        (str letter)))
    #'user/decode-letter
  14. Finally, write the decode function, which is similar to the encode function except that it should use decode-letter instead of encode-letter:
    user=>
    (defn decode [s]
      (let [number-of-words (count (clojure.string/split s #" "))]
        (clojure.string/replace s #"\#\d+" (fn [s] (decode-letter s number-of-words)))))
    #'user/decode
  15. Test your functions and make sure that they both work:
    user=> (encode "If you want to keep a secret, you must also hide it from yourself.")

    The output is as follows:

    "#7569#13456 #18225#15625#17161 #17689#12321#15376#16900 #16900#15625 #14641#13225#13225#15876 #12321 #16641#13225#12769#16384#13225#16900, #18225#15625#17161 #15129#17161#16641#16900 #12321#14884#16641#15625 #13924#14161#12996#13225 #14161#16900 #13456#16384#15625#15129 #18225#15625#17161#16384#16641#13225#14884#13456."
    user=> (decode *1)
    "If you want to keep a secret, you must also hide it from yourself."
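
The steps above can be gathered into a single file. The following is a consolidated sketch rather than the exercise's exact code: the word-count helper and the parameter names are made up here, squaring uses integer multiplication instead of Math/pow (which avoids the round trip through a double), and Long/parseLong stands in for Integer/parseInt. It assumes JVM Clojure.

```clojure
(require 'clojure.string)

;; Hypothetical helper: counts space-separated words, as the exercise does.
(defn word-count
  [s]
  (count (clojure.string/split s #" ")))

;; Encodes a one-character string as "#" followed by (code + offset) squared.
(defn encode-letter
  [s offset]
  (let [code (+ offset (int (first s)))]
    (str "#" (* code code))))

;; Reverses encode-letter: drops the "#", takes the square root,
;; subtracts the offset, and converts the code back to a character.
(defn decode-letter
  [s offset]
  (let [squared (Long/parseLong (subs s 1))]
    (str (char (- (long (Math/sqrt squared)) offset)))))

(defn encode
  [s]
  (let [n (word-count s)]
    (clojure.string/replace s #"\w" (fn [letter] (encode-letter letter n)))))

(defn decode
  [s]
  (let [n (word-count s)]
    (clojure.string/replace s #"#\d+" (fn [token] (decode-letter token n)))))

(encode "Super secret")
;; => "#7225#14161#12996#10609#13456 #13689#10609#10201#13456#10609#13924"
(decode (encode "Super secret"))
;; => "Super secret"
```

Because spaces are left untouched by encode, the word count computed by decode on the obfuscated text matches the one used by encode, which is what makes the round trip work.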

In this exercise, we've put into practice working with numbers and strings by creating an encoding system. We can now move on to learning other data types, starting with Booleans.

Booleans

Booleans are implemented as Java's java.lang.Boolean in Clojure or JavaScript's "Boolean" in ClojureScript. Their value can either be true or false, and their literal notations are simply the lowercase true and false.
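
A few lines at a JVM Clojure REPL illustrate this. The predicates shown are from clojure.core and are not otherwise covered in this section:

```clojure
(type true)       ; => java.lang.Boolean
(true? true)      ; => true
(false? false)    ; => true
(not true)        ; => false
(boolean? false)  ; available since Clojure 1.9 => true
(= 1 1)           ; comparisons return Booleans => true
(< 3 2)           ; => false
```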

Symbols

Symbols are identifiers referring to something else. We have already been using symbols when creating bindings or calling functions. For example, when using def, the first argument is a symbol that will refer to a value, and when calling a function such as +, + is a symbol referring to the function implementing the addition. Consider the following examples:

user=> (def foo "bar")
#'user/foo
user=> foo
"bar"
user=> (defn add-2 [x] (+ x 2))
#'user/add-2
user=> add-2
#object[user$add_2 0x4e858e0a "user$add_2@4e858e0a"]

Here, we have created the user/foo symbol, which refers to the "bar" string, and the add-2 symbol, which refers to the function that adds 2 to its parameter. We have created those symbols in the user namespace, hence the notation with /: user/foo.

If we try to evaluate a symbol that has not been defined, we'll get an error:

user=> marmalade
Syntax error compiling at (REPL:0:0).
Unable to resolve symbol: marmalade in this context

In the REPL Basics topic of Chapter 1, Hello REPL!, we were able to use the following functions because they are bound to a specific symbol:

user=> str
#object[clojure.core$str 0x7bb6ab3a "clojure.core$str@7bb6ab3a"]
user=> +
#object[clojure.core$_PLUS_ 0x1c3146bc "clojure.core$_PLUS_@1c3146bc"]
user=> clojure.string/replace
#object[clojure.string$replace 0xf478a81 "clojure.string$replace@f478a81"]

Those gibberish-like values are printed representations of the functions: we are asking for the values bound to the symbols rather than invoking the functions (which we would do by wrapping them in parentheses).
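
Symbols are also values in their own right. As a hedged aside that goes slightly beyond this section, quoting a symbol with ' stops Clojure from looking up what it refers to, which lets us inspect the symbol itself (JVM Clojure assumed):

```clojure
'foo                     ; quoting prevents evaluation => foo
(type 'foo)              ; => clojure.lang.Symbol
(symbol? 'foo)           ; => true
(symbol "foo")           ; builds a symbol from a string => foo
(= 'foo (symbol "foo"))  ; => true
(name 'user/foo)         ; the part after the / => "foo"
(namespace 'user/foo)    ; the part before the / => "user"
```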

Keywords

You can think of a keyword as some kind of a special constant string. Keywords are a nice addition to Clojure because they are lightweight and convenient to use and create. You just need to use the colon character, :, at the beginning of a word to create a keyword:

user=> :foo
:foo
user=> :another_keyword
:another_keyword

They don't refer to anything else like symbols do; as you can see in the preceding example, when evaluated, they just return themselves. Keywords are typically used as keys in a key-value associative map, as we will see in the next topic about collections.
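
The following sketch shows a few clojure.core functions that operate on keywords. It assumes JVM Clojure; in particular, the identical? result relies on the JVM implementation interning keywords and should not be assumed for ClojureScript:

```clojure
(type :foo)                        ; => clojure.lang.Keyword
(keyword? :foo)                    ; => true
(keyword "foo")                    ; builds a keyword from a string => :foo
(name :foo)                        ; back to a string, without the colon => "foo"
(= :foo (keyword "foo"))           ; => true
(identical? :foo (keyword "foo"))  ; same interned object on the JVM => true
```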

In this section, we went through simple data types such as strings, numbers, Booleans, symbols, and keywords. We highlighted how their underlying implementation depends on the host platform because Clojure is a hosted language. In the next section, we will see how those values can be aggregated into collections.