So far we have been dealing with numeric and boolean datatypes. In this section we will look at character representation and how Julia handles ASCII and UTF-8 strings of characters. We will also introduce the concept of regular expressions, widely used in pattern matching and filtering operations.
Julia has a built-in type Char to represent a character. A Char occupies 32 bits, not 8, so it can hold any Unicode character rather than just an ASCII one. A character may be assigned in a number of ways:
julia> c = 'A'
julia> c = char(65)
julia> c = '\U0041'
All these represent the ASCII character capital A.
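As a minimal sketch, the three assignments can be compared directly. This assumes a recent Julia 1.x release, where the conversion is spelled Char(65) rather than char(65); the variable names are only illustrative.

```julia
# Three ways to build the same character (Julia 1.x spelling assumed).
a1 = 'A'          # character literal
a2 = Char(65)     # from the integer code point
a3 = '\U0041'     # Unicode escape with hexadecimal digits

same  = (a1 == a2 == a3)   # chained comparison: do all three agree?
code  = Int(a1)            # the code point as an integer
width = sizeof(Char)       # storage size of a Char in bytes

println(same, " ", code, " ", width)   # prints: true 65 4
```

The 4-byte size reported by sizeof is the 32 bits mentioned above.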
It is possible to specify a character code such as '\Uffff', but the char conversion does not check that every value is valid. However, Julia provides an is_valid_char() function:
julia> c = '\Udff3';
julia> is_valid_char(c)  # => gives false.
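The same check can be sketched for a recent Julia 1.x release, on the assumption that the validity test is spelled isvalid there rather than is_valid_char. The code point 0xdff3 falls in the surrogate range (U+D800 to U+DFFF), which is reserved and never denotes a character on its own.

```julia
# Validity checks on characters (Julia 1.x spelling assumed).
bad  = Char(0xdff3)   # a surrogate code point: constructed without complaint
good = 'A'

bad_ok  = isvalid(bad)    # false: surrogates are not valid characters
good_ok = isvalid(good)   # true

# isvalid can also test an integer code point without building a Char first.
raw_ok = isvalid(Char, 0xdff3)   # false

println(bad_ok, " ", good_ok, " ", raw_ok)
```

Constructing the character succeeds even though the value is invalid, which is why a separate validity test is needed.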
Julia uses the special C-like syntax for certain ASCII control characters such as '\b', '\t', '\n', '\r', '\f' for backspace, tab, newline, carriage...