The Clojure Workshop
A data type designates what kind of value a piece of data holds; it is a fundamental way of classifying data. Different types allow different kinds of operations: we can concatenate strings, multiply numbers, and perform Boolean algebra with Booleans. Because Clojure has a strong emphasis on practicality, we don't explicitly assign types to values in Clojure, but those values still have a type.
Clojure is a hosted language and has three notable, major implementations in Java, JavaScript, and .NET. Being a hosted language is a useful trait that allows Clojure programs to run in different environments and take advantage of the ecosystem of its host. Regarding data types, it means that each implementation has different underlying data types, but don't worry as those are just implementation details. As a Clojure programmer, it does not make much difference, and if you know how to do something in Clojure, you likely know how to do it in, say, ClojureScript.
In this topic, we will go through Clojure's simple data types: strings, numbers, Booleans, symbols, and keywords. Please note that all of these types are immutable.
Strings are sequences of characters representing text. We have been using and manipulating strings since the first exercise of Chapter 1, Hello REPL.
You can create a string by simply wrapping characters with double quotes ("):
user=> "I am a String"
"I am a String"
user=> "I am immutable"
"I am immutable"
String literals are only created with double quotes, and if you need to use double quotes in a string, you can escape them with the backslash character (\):
user=> (println "\"The measure of intelligence is the ability to change\" - Albert Einstein")
"The measure of intelligence is the ability to change" - Albert Einstein
nil
Strings cannot be changed; they are immutable. Any function that claims to transform a string actually yields a new value:
user=> (def silly-string "I am Immutable. I am a silly String")
#'user/silly-string
user=> (clojure.string/replace silly-string "silly" "clever")
"I am Immutable. I am a clever String"
user=> silly-string
"I am Immutable. I am a silly String"
In the preceding example, calling clojure.string/replace on silly-string returned a new string with the word "silly" replaced with "clever." However, when evaluating silly-string again, we can see that the value has not changed. The function returned a different value and did not change the original string.
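The same behavior can be checked with other string functions. The following is a minimal sketch for the JVM REPL; the names greeting, shouted, and capitalized are hypothetical and only serve this illustration:

```clojure
(require '[clojure.string :as str])

;; `greeting` is a hypothetical name used only for this illustration.
(def greeting "hello world")

;; Each call returns a brand new string...
(def shouted (str/upper-case greeting))      ;; "HELLO WORLD"
(def capitalized (str/capitalize greeting))  ;; "Hello world"

;; ...while the original value is left untouched.
(println greeting)                           ;; prints: hello world
```

Because the original value never changes, it is safe to pass the same string to several functions without worrying about one of them modifying it.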
Although a string is usually treated as a single unit of data representing text, strings are also collections of characters. In the JVM implementation of Clojure, strings are of the java.lang.String Java type and behave as collections of java.lang.Character values. For example, the following expression returns a character:
user=> (first "a collection of characters")
\a
user=> (type *1)
java.lang.Character
first returns the first element of a collection. Here, the literal notation of a character is \a. The type function returns the data type of a given value, which the REPL prints by name. Remember that we can use *1 to retrieve the last returned value in the REPL, so *1 evaluates to \a.
It is interesting to note that, in ClojureScript, strings are collections of one-character strings, because there is no character type in JavaScript. Here is a similar example in a ClojureScript REPL:
cljs.user=> (last "a collection of 1 character strings")
"s"
cljs.user=> (type *1)
#object[String]
As with the Clojure REPL, type returns the data type of the value. This time, in ClojureScript, the value returned by the last function (which returns the last element of a collection) is printed as #object[String], which means a JavaScript string.
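Since strings behave as collections, other general-purpose collection functions work on them as well. Here is a small sketch for the JVM REPL (the results would be one-character strings in ClojureScript); the name word is hypothetical:

```clojure
;; `word` is a hypothetical name used only for this illustration.
(def word "Clojure")

(count word)                ;; => 7
(seq word)                  ;; => (\C \l \o \j \u \r \e)
(nth word 2)                ;; => \o
(apply str (reverse word))  ;; => "erujolC"
```

Note that reverse returns a sequence of characters, not a string, which is why the result is passed back to str.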
You can find a few common functions for manipulating strings in the core namespace, such as str, which we used in Chapter 1, Hello REPL!, to concatenate (combine multiple strings together into one string):
user=> (str "That's the way you " "con" "ca" "te" "nate")
"That's the way you concatenate"
user=> (str *1 " - " silly-string)
"That's the way you concatenate - I am Immutable. I am a silly String"
Most functions for manipulating strings can be found in the clojure.string namespace. Here is a list of them using the REPL dir function:
user=> (dir clojure.string)
blank?
capitalize
ends-with?
escape
includes?
index-of
join
last-index-of
lower-case
re-quote-replacement
replace
replace-first
reverse
split
split-lines
starts-with?
trim
trim-newline
triml
trimr
upper-case
As a reminder, this is how you can use a function from a specific namespace:
user=> (clojure.string/includes? "potatoes" "toes")
true
We will not cover all the string functions, but feel free to try them out now. You can always look up the documentation of a string function from the preceding list with the doc function.
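As a starting point for experimenting, here is a short sample of a few functions from that list. The str alias is a choice made for this sketch, not something the text above requires:

```clojure
(require '[clojure.string :as str])

(str/trim "  padded  ")              ;; => "padded"
(str/upper-case "shout")             ;; => "SHOUT"
(str/split "a,b,c" #",")             ;; => ["a" "b" "c"]
(str/join "-" ["a" "b" "c"])         ;; => "a-b-c"
(str/starts-with? "potatoes" "pot")  ;; => true
(str/blank? "   ")                   ;; => true
```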
Clojure has good support for numbers and you will most likely not have to worry about the underlying types, as Clojure will handle pretty much anything. However, it is important to note that there are a few differences between Clojure and ClojureScript in that regard.
In Clojure, by default, integers are implemented as the java.lang.Long Java type unless the number is too big for Long. In that case, it is typed clojure.lang.BigInt:
user=> (type 1)
java.lang.Long
user=> (type 1000000000000000000)
java.lang.Long
user=> (type 10000000000000000000)
clojure.lang.BigInt
Notice, in the preceding example, that the number was too big to fit in the java.lang.Long Java type and, therefore, was implicitly typed clojure.lang.BigInt.
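The implicit typing described above applies to number literals. As a complementary sketch (going slightly beyond the text, and JVM-only), the regular + function does not promote a Long result that overflows, whereas the +' variant from clojure.core does. The name overflowed? is hypothetical:

```clojure
;; A literal with the N suffix is read directly as a BigInt.
(type 1N)                       ;; => clojure.lang.BigInt

;; +' promotes to BigInt when the result no longer fits in a Long.
(+' Long/MAX_VALUE 1)           ;; => 9223372036854775808N
(type (+' Long/MAX_VALUE 1))    ;; => clojure.lang.BigInt

;; Plain + throws instead of silently wrapping around.
;; `overflowed?` is a hypothetical name for this illustration.
(def overflowed?
  (try
    (+ Long/MAX_VALUE 1)
    false
    (catch ArithmeticException _ true)))
;; overflowed? => true
```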
Exact ratios are represented by Clojure as "Ratio" types, which have a literal representation. 5/4 cannot be reduced to an integer, so it evaluates to the ratio itself:
user=> 5/4
5/4
The result of dividing 3 by 4 can be represented by the ratio 3/4:
user=> (/ 3 4)
3/4
user=> (type 3/4)
clojure.lang.Ratio
4/4 is equivalent to 1 and is evaluated as follows:
user=> 4/4
1
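Ratios can be used in arithmetic without losing precision. The following sketch for the JVM REPL uses a few clojure.core functions (numerator, denominator, ratio?, and double) that are not introduced in the text above:

```clojure
(+ 1/2 1/3)        ;; => 5/6, the result stays exact
(numerator 6/8)    ;; => 3, because 6/8 is read as the reduced ratio 3/4
(denominator 6/8)  ;; => 4
(ratio? 3/4)       ;; => true
(ratio? 4/4)       ;; => false, because 4/4 is read as the integer 1
(double 3/4)       ;; => 0.75, explicit conversion to a floating-point number
```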
Decimal numbers are "double" precision floating-point numbers:
user=> 1.2
1.2
If we take our division of 3 by 4 again, but this time mix in a "Double" type, we will not get a ratio as a result:
user=> (/ 3 4.0)
0.75
This is because floating-point numbers are "contagious" in Clojure. Any operation involving floating-point numbers will result in a float or a double:
user=> (* 1.0 2)
2.0
user=> (type (* 1.0 2))
java.lang.Double
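To make the "contagious" behavior concrete, here is a small side-by-side sketch for the JVM REPL comparing operations with and without a floating-point operand:

```clojure
(+ 1 2)             ;; => 3, both operands are Longs
(+ 1 2.0)           ;; => 3.0, one Double makes the result a Double
(/ 10 4)            ;; => 5/2, integer division yields an exact ratio
(/ 10 4.0)          ;; => 2.5
(+ 1/2 0.5)         ;; => 1.0, ratios are also converted
(type (+ 1/2 0.5))  ;; => java.lang.Double
```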
In ClojureScript, however, numbers are just "JavaScript numbers," which are all double-precision floating-point numbers. JavaScript does not define different types of numbers like Java and some other programming languages do (for example, long, integer, and short):
cljs.user=> 1
1
cljs.user=> 1.2
1.2
cljs.user=> (/ 3 4)
0.75
cljs.user=> 3/4
0.75
cljs.user=> (* 1.0 2)
2
Notice that, this time, every operation returns a floating-point number. The absence of a decimal point in 1 or 2 is just a formatting convenience.
We can make sure that all those numbers are JavaScript numbers (double-precision, floating-point) by using the type function:
cljs.user=> (type 1)
#object[Number]
cljs.user=> (type 1.2)
#object[Number]
cljs.user=> (type 3/4)
#object[Number]
If you need to do more than simple arithmetic, you can use the Java or JavaScript math libraries, which are similar apart from a few minor differences.
You will learn more about host platform interoperability (how to interact with the host platform and its ecosystem) in Chapter 9, Host Platform Interoperability with Java and JavaScript, but the following examples will get you started with more complicated math using the math library:
Reading a value from a constant can be done like this:
user=> Math/PI
3.141592653589793
And calling a function, like the usual Clojure functions, can be done like this:
user=> (Math/random)
0.25127992428738254
user=> (Math/sqrt 9)
3.0
user=> (Math/round 0.7)
1
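A few more calls in the same style are sketched below. They assume the JVM host, where Math refers to java.lang.Math; the equivalent JavaScript functions may return values formatted differently:

```clojure
(Math/abs -5)     ;; => 5
(Math/floor 2.7)  ;; => 2.0
(Math/ceil 2.1)   ;; => 3.0
(Math/pow 2 10)   ;; => 1024.0, Math/pow always returns a double
(Math/hypot 3 4)  ;; => 5.0
```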
You have been contacted by a secret government agency to develop an algorithm that encodes text into a secret string that only the owner of the algorithm can decode. Apparently, they don't trust other security mechanisms such as SSL and will only communicate sensitive information with their own proprietary technology.
You need to develop an encode function and a decode function. The encode function should replace letters with numbers that are not easily guessable. For that purpose, each letter will take the character's number value in the ASCII table, add another number to it (the number of words in the sentence to encode), and finally, compute the square value of that number. The decode function should allow the user to revert to the original string. Someone highly ranked in the agency came up with that algorithm so they trust it to be very secure.
In this exercise, we will put into practice some of the things we've learned about strings and numbers by building an obfuscation machine:
Look up the documentation of the clojure.string/replace function:
user=> (doc clojure.string/replace)
-------------------------
clojure.string/replace
([s match replacement])
  Replaces all instance of match with replacement in s.

   match/replacement can be:

   string / string
   char / char
   pattern / (string or function of match).

   See also replace-first.

   The replacement is literal (i.e. none of its characters are treated
   specially) for all cases above except pattern / string.

   For pattern / string, $1, $2, etc. in the replacement string are
   substituted with the string that matched the corresponding
   parenthesized group in the pattern. If you wish your replacement
   string r to be used literally, use (re-quote-replacement r) as the
   replacement argument. See also documentation for
   java.util.regex.Matcher's appendReplacement method.

   Example:
   (clojure.string/replace "Almost Pig Latin" #"\b(\w)(\w+)\b" "$2$1ay")
   -> "lmostAay igPay atinLay"
Notice that the replace function can take a pattern and a function of the matching result as parameters. We don't know how to iterate over collections yet, but using the replace function with a pattern and a "replacement function" should do the job.
Try the replace function with the #"\w" pattern (which matches any word character), replacing each match with the ! character, and observe the result:
user=> (clojure.string/replace "Hello World" #"\w" "!")
The output is as follows:
"!!!!! !!!!!"
Try the replace function with the same pattern, but this time pass an anonymous function that takes the matching letter as a parameter:
user=> (clojure.string/replace "Hello World" #"\w" (fn [letter] (do (println letter) "!")))
The output is as follows:
H
e
l
l
o
W
o
r
l
d
"!!!!! !!!!!"
Observe that the function was called for each letter, printing the match out to the console and finally returning the string with the matches replaced by the ! character. It looks like we should be able to write our encoding logic in that replacement function.
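The replacement function can return any string computed from the match, not just a constant. A minimal sketch of this pattern/function form follows; the second example and its digits parameter are made up for illustration:

```clojure
(require '[clojure.string :as str])

;; An existing function can be passed directly as the replacement function.
(str/replace "Hello World" #"\w" str/upper-case)
;; => "HELLO WORLD"

;; With a pattern that has no groups, the function receives each match as a string.
(str/replace "a1b22c333" #"\d+" (fn [digits] (str "<" (count digits) ">")))
;; => "a<1>b<2>c<3>"
```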
A character can be converted to its number value with the int function, which coerces its parameter to an integer. It can be used like this:
user=> (int \a)
97
The replacement function receives each match as a string rather than a character, so use the char-array function combined with first to convert our string to a character, as follows:
user=> (first (char-array "a"))
\a
Compute the square of the character's number value with the Math/pow function, as follows:
user=> (Math/pow (int (first (char-array "a"))) 2)
9409.0
Now turn this logic into a function that we can pass to the replace function. First, let's remove the decimal part by coercing the result to an int, and put things together in an encode-letter function, as follows:
user=> (defn encode-letter
         [s]
         (let [code (Math/pow (int (first (char-array s))) 2)]
           (str (int code))))
#'user/encode-letter
user=> (encode-letter "a")
"9409"
Great! It seems to work. Let's now test our function as part of the replace function.
Write the encode function, which uses clojure.string/replace as well as our encode-letter function:
user=> (defn encode
         [s]
         (clojure.string/replace s #"\w" encode-letter))
#'user/encode
user=> (encode "Hello World")
"518410201116641166412321 756912321129961166410000"
It seems to work but the resulting string will be hard to decode without being able to identify each letter individually.
There is another thing that we did not take into account: the encode function should take an arbitrary number to add to the code before calculating the square value.
First, add a separator to the output of the encode-letter function, for example, the # character, so that we can identify each letter individually. Second, add an extra parameter to encode-letter, whose value is added to the character code before calculating the square value:
user=> (defn encode-letter
         [s x]
         (let [code (Math/pow (+ x (int (first (char-array s)))) 2)]
           (str "#" (int code))))
#'user/encode-letter
Test the encode function another time:
user=> (encode "Hello World")
Execution error (ArityException) at user/encode (REPL:3).
Wrong number of args (1) passed to: user/encode-letter
Our encode function is now failing because it is expecting an extra argument.
Update the encode function to calculate the number of words in the text to obfuscate, and pass it to the encode-letter function. You can use the clojure.string/split function with a whitespace pattern, as follows, to count the number of words:
user=> (defn encode
         [s]
         (let [number-of-words (count (clojure.string/split s #" "))]
           (clojure.string/replace s #"\w" (fn [s] (encode-letter s number-of-words)))))
#'user/encode
user=> (encode "Super secret")
"#7225#14161#12996#10609#13456 #13689#10609#10201#13456#10609#13924"
user=> (encode "Super secret message")
"#7396#14400#13225#10816#13689 #13924#10816#10404#13689#10816#14161 #12544#10816#13924#13924#10000#11236#10816"
What a beautiful, unintelligible, obfuscated string – well done! Notice how the numbers for the same letters are different depending on the number of words in the phrase to encode. It seems to work according to the specification!
We can now start working on the decode function, for which we will need to use the following functions:
Math/sqrt to obtain the square root value of a number.
char to retrieve a letter from a character code (a number).
subs as in substring, to get a sub-portion of a string (and get rid of our # separator).
Integer/parseInt to convert a string to an integer.
Write a decode-letter function using a combination of the preceding functions, to decode an obfuscated character:
user=> (defn decode-letter
         [x y]
         (let [number (Integer/parseInt (subs x 1))
               letter (char (- (Math/sqrt number) y))]
           (str letter)))
#'user/decode-letter
Write the decode function, which is similar to the encode function except that it should use decode-letter instead of encode-letter:
user=> (defn decode [s]
         (let [number-of-words (count (clojure.string/split s #" "))]
           (clojure.string/replace s #"\#\d+" (fn [s] (decode-letter s number-of-words)))))
#'user/decode
user=> (encode "If you want to keep a secret, you must also hide it from yourself.")
The output is as follows:
"#7569#13456 #18225#15625#17161 #17689#12321#15376#16900 #16900#15625 #14641#13225#13225#15876 #12321 #16641#13225#12769#16384#13225#16900, #18225#15625#17161 #15129#17161#16641#16900 #12321#14884#16641#15625 #13924#14161#12996#13225 #14161#16900 #13456#16384#15625#15129 #18225#15625#17161#16384#16641#13225#14884#13456."
user=> (decode *1)
"If you want to keep a secret, you must also hide it from yourself."
In this exercise, we've put into practice working with numbers and strings by creating an encoding system. We can now move on to learning other data types, starting with Booleans.
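For reference, the functions built during the exercise can be gathered into a single file and checked with a round trip. This is a sketch for the JVM: it uses a str alias for clojure.string and renames the anonymous function parameters (letter, token) for readability, and message and encoded are hypothetical names; the logic is otherwise the same as in the steps above:

```clojure
(require '[clojure.string :as str])

(defn encode-letter
  "Encodes a one-letter string s as # followed by the square of (char code + x)."
  [s x]
  (let [code (Math/pow (+ x (int (first (char-array s)))) 2)]
    (str "#" (int code))))

(defn encode
  "Replaces every word character of s with its encoded form."
  [s]
  (let [number-of-words (count (str/split s #" "))]
    (str/replace s #"\w" (fn [letter] (encode-letter letter number-of-words)))))

(defn decode-letter
  "Decodes a token such as \"#7225\" back to a one-letter string."
  [x y]
  (let [number (Integer/parseInt (subs x 1))
        letter (char (- (Math/sqrt number) y))]
    (str letter)))

(defn decode
  "Replaces every #-prefixed number of s with the letter it encodes."
  [s]
  (let [number-of-words (count (str/split s #" "))]
    (str/replace s #"\#\d+" (fn [token] (decode-letter token number-of-words)))))

;; Round trip: decoding the encoded text gives back the original message.
(def message "Super secret")
(def encoded (encode message))
(= message (decode encoded))  ;; => true
```

The round trip works because spaces are left untouched by encode, so decode counts the same number of words in the obfuscated text as encode did in the original.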
Booleans are implemented as Java's java.lang.Boolean in Clojure or JavaScript's "Boolean" in ClojureScript. Their value can either be true or false, and their literal notations are simply the lowercase true and false.
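A few expressions that produce or combine Boolean values are sketched below for the JVM REPL:

```clojure
(type true)       ;; => java.lang.Boolean
(not true)        ;; => false
(= 1 1)           ;; => true
(> 1 2)           ;; => false
(and true false)  ;; => false
(or true false)   ;; => true
```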
Symbols are identifiers referring to something else. We have already been using symbols when creating bindings or calling functions. For example, when using def, the first argument is a symbol that will refer to a value, and when calling a function such as +, + is a symbol referring to the function implementing the addition. Consider the following examples:
user=> (def foo "bar")
#'user/foo
user=> foo
"bar"
user=> (defn add-2 [x] (+ x 2))
#'user/add-2
user=> add-2
#object[user$add_2 0x4e858e0a "user$add_2@4e858e0a"]
Here, we have created the user/foo symbol, which refers to the "bar" string, and the add-2 symbol, which refers to the function that adds 2 to its parameter. We have created those symbols in the user namespace, hence the notation with /: user/foo.
If we try to evaluate a symbol that has not been defined, we'll get an error:
user=> marmalade
Syntax error compiling at (REPL:0:0).
Unable to resolve symbol: marmalade in this context
In the REPL Basics topic of Chapter 1, Hello REPL!, we were able to use the following functions because they are bound to a specific symbol:
user=> str
#object[clojure.core$str 0x7bb6ab3a "clojure.core$str@7bb6ab3a"]
user=> +
#object[clojure.core$_PLUS_ 0x1c3146bc "clojure.core$_PLUS_@1c3146bc"]
user=> clojure.string/replace
#object[clojure.string$replace 0xf478a81 "clojure.string$replace@f478a81"]
Those gibberish-like values are string representations of the functions, because we are asking for the values bound to the symbols rather than invoking the functions (wrapping them with parentheses).
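The difference between referring to a function through its symbol and invoking it can be sketched as follows. The name plus-two is hypothetical, and the quoting syntax on the last line goes slightly beyond what has been covered so far:

```clojure
(defn add-2 [x] (+ x 2))

add-2          ;; evaluates to the function object itself
(add-2 3)      ;; => 5, wrapping the symbol in parentheses invokes the function
(fn? add-2)    ;; => true

;; A second symbol can be bound to the same function value.
;; `plus-two` is a hypothetical name used only for this illustration.
(def plus-two add-2)
(plus-two 40)  ;; => 42

;; Quoting a symbol prevents its evaluation and yields the symbol itself.
(type 'add-2)  ;; => clojure.lang.Symbol
```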
You can think of a keyword as some kind of a special constant string. Keywords are a nice addition to Clojure because they are lightweight and convenient to use and create. You just need to use the colon character, :, at the beginning of a word to create a keyword:
user=> :foo
:foo
user=> :another_keyword
:another_keyword
They don't refer to anything else like symbols do; as you can see in the preceding example, when evaluated, they just return themselves. Keywords are typically used as keys in a key-value associative map, as we will see in the next topic about collections.
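Here is a brief sketch of keywords in action on the JVM, including a small preview of their use as map keys. The book map is a hypothetical example; maps themselves are covered with collections:

```clojure
(type :foo)               ;; => clojure.lang.Keyword
(keyword "foo")           ;; => :foo
(name :foo)               ;; => "foo"
(= :foo (keyword "foo"))  ;; => true

;; `book` is a hypothetical map used only for this illustration.
(def book {:title "The Clojure Workshop"})
(get book :title)         ;; => "The Clojure Workshop"
```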
In this section, we went through simple data types such as strings, numbers, Booleans, symbols, and keywords. We highlighted how their underlying implementation depends on the host platform because Clojure is a hosted language. In the next section, we will see how those values can be aggregated into collections.