Basic data types
In this section, we will explore the basic data types. The Common Language Infrastructure (CLI) defines a set of standard types and operations that are supported by all programming languages targeting the CLI. These data types are provided in the System namespace. All of them, however, have a C# alias. These aliases are keywords in the C# language, which means they can only be used for their designated purpose and not as identifiers such as variable, class, or method names. The C# name and the .NET name, along with a short description of each type, are listed in the following table (sorted alphabetically by the C# name):
The types listed in this table are called simple types or primitive types. Apart from these, there are two more built-in types, string and object, which are covered at the end of this section.
Let's explore all of the primitive types in detail in the following sections.
The integral types
C# supports eight integer types that represent various ranges of integral numbers. The bits and range of each of them are shown in the following table:
As shown in the preceding table, C# defines both signed and unsigned integer types. The major difference between signed and unsigned integers is the way in which the high order bit is read. In the case of a signed integer, the high order bit is considered the sign flag. If the sign flag is 0, then the number is positive but if the sign flag is 1, then the number is negative.
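As a minimal sketch of how the high-order bit changes the interpretation, the following example reads the same 8-bit pattern as a byte and as an sbyte. The SignBitDemo class and its AsSigned method are hypothetical names used only for illustration. Note that signed integers use the two's complement representation, so setting the high-order bit does more than flip the sign:

```csharp
using System;

public static class SignBitDemo
{
    // Reinterprets an unsigned 8-bit pattern as a signed 8-bit value.
    // unchecked suppresses overflow checking so that only the bits are reused.
    public static sbyte AsSigned(byte bits) => unchecked((sbyte)bits);

    public static void Main()
    {
        byte pattern = 0b1000_0001;            // high-order bit is 1
        Console.WriteLine(pattern);            // 129
        Console.WriteLine(AsSigned(pattern));  // -127
    }
}
```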
The default value of all integral types is 0. All of these types define two constants called MinValue and MaxValue, which provide the minimum and maximum value of the type.
Integral literals, which are numbers that appear directly in code (such as 0, -42, and so on), can be specified as decimal, hexadecimal, or binary literals. Decimal literals do not require any prefix. Hexadecimal literals are prefixed with 0x or 0X, and binary literals are prefixed with 0b or 0B. An underscore (_) can be used as a digit separator with all numeric literals. Examples of such literals are shown in the following snippet:
int dec = 32;
int hex = 0x2A;
int bin = 0b_0010_1010;
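An integral literal without any suffix has the first of the types int, uint, long, and ulong in which its value fits, so a small value such as 42 is inferred by the compiler as int. To indicate a long integer, use the l or L suffix for a signed 64-bit integer and ul or UL for an unsigned 64-bit integer.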
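The effect of suffixes on the type of a literal can be checked with a short sketch such as the following. The LiteralTypeDemo class is a hypothetical name used only for illustration:

```csharp
using System;

public static class LiteralTypeDemo
{
    public static void Main()
    {
        var small = 42;              // no suffix, fits in int
        var large = 3_000_000_000;   // no suffix, too large for int but fits in uint
        var signed64 = 42L;          // L suffix
        var unsigned64 = 42UL;       // UL suffix

        Console.WriteLine(small.GetType());       // System.Int32
        Console.WriteLine(large.GetType());       // System.UInt32
        Console.WriteLine(signed64.GetType());    // System.Int64
        Console.WriteLine(unsigned64.GetType());  // System.UInt64
    }
}
```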
The floating-point types
The floating-point types are used to represent numbers having fractional components. C# defines two floating-point types, as shown in the following table:
The float type represents a 32-bit, single-precision floating-point number, whereas double represents a 64-bit, double-precision floating-point number. These types are implementations of the IEEE Standard for Floating-Point Arithmetic (IEEE 754), which is a standard established by the Institute of Electrical and Electronics Engineers (IEEE) in 1985 for floating-point arithmetic.
The default value for floating-point types is 0. These types also define two constants called MinValue and MaxValue that provide the minimum and maximum value of the type. In addition, these types provide constants that represent not-a-number (System.Double.NaN) and infinity (System.Double.NegativeInfinity and System.Double.PositiveInfinity). The following code listing shows several variables initialized with floating-point values:
var a = 42.99;
float b = 19.50f;
System.Double c = -1.23;
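The not-a-number and infinity constants mentioned earlier typically appear as the results of operations such as division by zero. The following is a minimal sketch; the SpecialValuesDemo class and its Divide method are hypothetical names used only for illustration:

```csharp
using System;

public static class SpecialValuesDemo
{
    // Floating-point division by zero does not throw; it yields infinity or NaN.
    public static double Divide(double a, double b) => a / b;

    public static void Main()
    {
        double positive = Divide(1.0, 0.0);
        double negative = Divide(-1.0, 0.0);
        double undefined = Divide(0.0, 0.0);

        Console.WriteLine(double.IsPositiveInfinity(positive));  // True
        Console.WriteLine(double.IsNegativeInfinity(negative));  // True
        Console.WriteLine(double.IsNaN(undefined));              // True
        Console.WriteLine(undefined == double.NaN);              // False
    }
}
```

Because NaN does not compare equal to any value, including itself, double.IsNaN() is the reliable way to test for it.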
By default, a non-integer number such as 42.99 is considered a double. If you want to specify this as a float type, then you need to suffix the value with the f or F character, such as in 42.99f or 42.99F. Alternatively, you can also explicitly indicate a double literal with the d or D suffix, such as in 42.99d or 42.99D.
Floating-point types store fractional parts as sums of inverse powers of two. For this reason, they can represent exactly only values such as 10, 10.25, 10.5, and so on. Other numbers, such as 1.23 or 19.99, cannot be represented exactly and are stored only as approximations. Although double has about 15 decimal digits of precision, as compared to only about 7 for float, precision loss still accumulates when performing repeated calculations.
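The difference between exactly representable values and approximations can be observed with a simple equality check. This is a minimal sketch; the PrecisionDemo class and its SumsTo method are hypothetical names used only for illustration:

```csharp
using System;

public static class PrecisionDemo
{
    // Returns true only if the double sum a + b is exactly equal to expected.
    public static bool SumsTo(double a, double b, double expected) => a + b == expected;

    public static void Main()
    {
        // 10.25 and 10.5 are sums of powers of two, so the arithmetic is exact.
        Console.WriteLine(SumsTo(10.25, 10.5, 20.75));  // True

        // 0.1, 0.2, and 0.3 are stored as approximations, so the sum differs slightly.
        Console.WriteLine(SumsTo(0.1, 0.2, 0.3));       // False
    }
}
```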
This makes double and float difficult or even inappropriate to use in certain types of applications, such as financial applications, where precision is key. For this purpose, the decimal type is provided.
The decimal type
The decimal type can represent up to 28 decimal places. The details for the decimal type are shown in the following table:
The default value for the decimal type is 0. MinValue and MaxValue constants that define the minimum and maximum value of the type are also available. A decimal literal can be specified using the m or M suffix, as shown in the following snippet:
decimal a = 42.99m;
var b = 12.45m;
System.Decimal c = 100.75M;
It is important to note that the decimal type minimizes errors during rounding but does not eliminate the need for rounding. For instance, the result of the operation 1m / 3 * 3 is not 1 but 0.9999999999999999999999999999. On the other hand, Math.Round(1m / 3 * 3) yields the value 1.
The decimal type is designed for use in applications where precision is key. Floats and doubles are much faster types (because they use binary math, which is faster to compute), while the decimal type is slower (as the name implies, it uses decimal math, which is slower to compute). The decimal type can be an order of magnitude slower than the double type. Financial applications, where small inaccuracies can accumulate into significant amounts over repeated computations, are a typical use case for the decimal type. In such applications, precision matters more than speed.
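To illustrate how small inaccuracies accumulate with double but not with decimal, the following sketch adds one tenth repeatedly using both types. The AccumulationDemo class and its methods are hypothetical names used only for illustration:

```csharp
using System;

public static class AccumulationDemo
{
    // Adds 0.1 to a running double total the given number of times.
    public static double SumDouble(int count)
    {
        double total = 0.0;
        for (int i = 0; i < count; i++) total += 0.1;
        return total;
    }

    // Adds 0.1m to a running decimal total the given number of times.
    public static decimal SumDecimal(int count)
    {
        decimal total = 0m;
        for (int i = 0; i < count; i++) total += 0.1m;
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumDouble(10) == 1.0);    // False
        Console.WriteLine(SumDecimal(10) == 1.0m);  // True
    }
}
```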
The char type
The character type is used to represent a 16-bit Unicode character. Unicode defines a character set that is intended to represent the characters of most languages in the world. Characters are represented by enclosing them in single quotation marks (''). Examples of this include 'A', 'B', 'c', and '\u0058'.
Character values can be literals, hexadecimal escape sequences that have the form '\xdddd', or Unicode representations that have the form '\udddd' (where dddd is a 16-bit hexadecimal value). The following listing shows several examples:
char a = 'A';
char b = '\x0065';
char c = '\u15FE';
The default value for the char type is decimal 0, or its equivalents, '\0', '\x0000', or '\u0000'.
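The following sketch shows that the different notations produce ordinary char values whose numeric codes can be inspected with a cast. The CharDemo class and its CodeOf method are hypothetical names used only for illustration:

```csharp
using System;

public static class CharDemo
{
    // Returns the numeric UTF-16 code of a character.
    public static int CodeOf(char c) => (int)c;

    public static void Main()
    {
        Console.WriteLine(CodeOf('A'));            // 65
        Console.WriteLine('\x0065');               // e
        Console.WriteLine('\u0058');               // X
        Console.WriteLine(default(char) == '\0');  // True
    }
}
```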
The bool type
C# uses the bool keyword to represent the Boolean type. It can have two values, true or false, as shown in the following table:
The default value for the bool type is false. Unlike in other languages (such as C++), integer values or any other values do not implicitly convert into the bool type. A Boolean variable can be assigned either a Boolean literal (true or false) or an expression that evaluates to bool.
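In practice, this means that an integer must be compared explicitly to produce a Boolean value. The following is a minimal sketch; the BoolDemo class and its HasItems method are hypothetical names used only for illustration:

```csharp
using System;

public static class BoolDemo
{
    // An explicit comparison is required; "return count;" would not compile.
    public static bool HasItems(int count) => count != 0;

    public static void Main()
    {
        int count = 3;
        // if (count) { }  // compile-time error: cannot implicitly convert int to bool
        Console.WriteLine(HasItems(count));  // True
        Console.WriteLine(default(bool));    // False
    }
}
```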
The string type
A string is a sequence of characters. In C#, the type for representing a string is called string and is an alias for the .NET System.String. You can use either of these two names interchangeably. Internally, a string contains a read-only collection of char objects. This makes strings immutable, which means that you cannot change a string but need to create a new one every time you want to modify the content of an existing string. Strings are not null-terminated (unlike in other languages such as C++) and can contain any number of null characters ('\0'). The Length property of a string returns the total number of char objects it contains.
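A short sketch can confirm that an embedded null character neither ends the string nor is excluded from its length. The NullCharDemo class is a hypothetical name used only for illustration:

```csharp
using System;

public static class NullCharDemo
{
    public static void Main()
    {
        string text = "ab\0cd";                 // null character in the middle
        Console.WriteLine(text.Length);         // 5
        Console.WriteLine(text[2] == '\0');     // True
        Console.WriteLine(text.EndsWith("cd")); // True
    }
}
```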
Strings can be declared and initialized in a variety of ways, as shown here:
string s1;                        // uninitialized
string s2 = null;                 // initialized with null
string s3 = String.Empty;         // empty string
string s4 = "hello world";        // initialized with text
var s5 = "hello world";
System.String s6 = "hello world";
char[] letters = { 'h', 'e', 'l', 'l', 'o' };
string s7 = new string(letters);  // from an array of chars
It is important to note that the typical situation in which you use the new operator to create a string object is when you initialize it from an array of characters.
As mentioned before, strings are immutable. You can read the individual characters of a string, but you cannot change them:
char c = s4[0];  // OK
s4[0] = 'H';     // error
The following methods may seem to modify a string:
- Remove(): This removes a part of the string.
- ToUpper()/ToLower(): This converts all of the characters into uppercase or lowercase.
None of these methods modifies the existing string; instead, each returns a new one.
In the following example, s6 is the string defined earlier, s8 will contain hello, s9 will contain HELLO WORLD, and s6 will continue to contain hello world:
var s8 = s6.Remove(5);  // hello
var s9 = s6.ToUpper();  // HELLO WORLD
You can convert any built-in type, such as integer or floating-point numbers, into a string using the ToString() method. This is actually a virtual method of the System.Object type, which is the base class for any .NET type. By overriding this method, any type can provide a way to serialize an object to a string:
int i = 42;
double d = 19.99;
var s1 = i.ToString();
var s2 = d.ToString();
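Overriding ToString() in your own type might look like the following sketch. The Point class is a hypothetical type introduced only for illustration, and classes are covered in detail later:

```csharp
using System;

// A hypothetical type that provides its own string representation.
public class Point
{
    private readonly int x;
    private readonly int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Replaces the default implementation, which returns the type name.
    public override string ToString() => "(" + x + ", " + y + ")";
}

public static class ToStringDemo
{
    public static void Main()
    {
        var p = new Point(3, 4);
        Console.WriteLine(p.ToString());  // (3, 4)
        Console.WriteLine(p);             // (3, 4), WriteLine calls ToString()
    }
}
```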
Strings can be composed in several ways:
- Using the concatenation operator, +.
- Using the Format() method: The first argument of this method is the format, in which each parameter is indicated positionally with the index specified in curly braces, such as {0}, {1}, {2}, and so on. Specifying an index beyond the number of arguments results in a runtime exception.
- Using string interpolation, which is practically a syntactic shortcut for using the String.Format() method: The string must be prefixed with $ and the arguments are specified directly in curly braces.
An example of all of these methods is shown here:
int i = 42;
string s1 = "This is item " + i.ToString();
string s2 = string.Format("This is item {0}", i);
string s3 = $"This is item {i}";
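The runtime exception raised when a format index exceeds the number of arguments is a FormatException. The following sketch shows one way to observe it; the FormatDemo class and its CanFormat method are hypothetical names used only for illustration:

```csharp
using System;

public static class FormatDemo
{
    // Returns false if the format string refers to an argument that was not supplied.
    public static bool CanFormat(string format, params object[] args)
    {
        try
        {
            string.Format(format, args);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanFormat("This is item {0}", 42));        // True
        Console.WriteLine(CanFormat("This is item {0} of {1}", 42)); // False, {1} has no argument
    }
}
```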
Some characters have a special meaning and are prefixed with a backslash (\). These are called escape sequences. The following table lists all of them:
Escape sequences are necessary in certain cases, such as when you specify a Windows file path or when you need a text that spans multiple lines. The following code shows several examples where escape sequences are used:
var s1 = "c:\\Program Files (x86)\\Windows Kits\\";
var s2 = "That was called a \"demo\"";
var s3 = "This text\nspans multiple lines.";
You can, however, avoid using escape sequences by using verbatim strings. These are prefixed with the @ symbol. When the compiler encounters such a string, it does not interpret escape sequences. If you want to use quotation marks in a string when using verbatim strings, you must double them. The following sample shows the preceding examples rewritten with verbatim strings:
var s1 = @"c:\Program Files (x86)\Windows Kits\";
var s2 = @"That was called a ""demo""";
var s3 = @"This text
spans multiple lines.";
Prior to C# 8, if you wanted to use string interpolation with verbatim strings, you had to first specify the $ symbol for string interpolation and then @ for verbatim strings. In C# 8, you can specify these two symbols in any order.
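As a brief sketch of combining the two prefixes, the following example builds a Windows path from a variable without escaping the backslashes. The VerbatimInterpolationDemo class and its BuildPath method are hypothetical names used only for illustration, and the $@ order is used so that the code also compiles with versions earlier than C# 8:

```csharp
using System;

public static class VerbatimInterpolationDemo
{
    // Backslashes are taken literally, while {folder} is still interpolated.
    public static string BuildPath(string folder) => $@"c:\Program Files (x86)\{folder}\";

    public static void Main()
    {
        Console.WriteLine(BuildPath("Windows Kits"));  // c:\Program Files (x86)\Windows Kits\
    }
}
```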
The object type
The object type is the base type for all other types in C#, even though you do not specify this explicitly, as we will see in the following chapters. The object keyword in C# is an alias for the .NET System.Object type. You can use these two interchangeably.
The object type provides some basic functionalities to all other classes in the form of several virtual methods that any derived class can override, if necessary. These methods are listed in the following table:
Apart from these, the object class contains several other methods. An important one to note is the GetType() method, which is not virtual and which returns a System.Type object with information about the type of the current instance.
Another important thing to notice is the way the Equals() method works because its default behavior is different for reference and value types. We have not covered these concepts yet but will do so later in this chapter. For the time being, keep in mind that, for reference types, this method performs reference equality by default; this means it checks whether the two variables point to the same object on the heap. For value types, it performs value equality; this means that the two variables are of the same type and that the public and private fields of the two objects are equal.
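The two default behaviors of Equals() can be compared with a minimal sketch. The Holder class and the Pair struct are hypothetical types introduced only for illustration, and neither overrides Equals():

```csharp
using System;

// Hypothetical reference type: default Equals() compares references.
public class Holder
{
    public int Value;
}

// Hypothetical value type: default Equals() compares field values.
public struct Pair
{
    public int A;
    public int B;
}

public static class EqualsDemo
{
    public static void Main()
    {
        var h1 = new Holder { Value = 1 };
        var h2 = new Holder { Value = 1 };
        Console.WriteLine(h1.Equals(h2));  // False, two distinct objects
        Console.WriteLine(h1.Equals(h1));  // True, same object

        var p1 = new Pair { A = 1, B = 2 };
        var p2 = new Pair { A = 1, B = 2 };
        Console.WriteLine(p1.Equals(p2));  // True, same type and equal fields
    }
}
```

Types such as string override Equals() to compare content, so the reference comparison shown here applies only when the method has not been overridden.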
The object type is a reference type. The default value of a variable of the object type is null. However, a variable of the object type can be assigned any value of any type. When you assign a value type value to object, the operation is called boxing. The reverse operation of converting the value of object into a value type is called unboxing. This will be detailed in a later section in this chapter.
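Until that section, the following minimal sketch may help to picture the two operations. The BoxingDemo class and its methods are hypothetical names used only for illustration:

```csharp
using System;

public static class BoxingDemo
{
    // Boxing: the int value is implicitly converted to object.
    public static object Box(int value) => value;

    // Unboxing: an explicit cast extracts the int from the object.
    public static int Unbox(object boxed) => (int)boxed;

    public static void Main()
    {
        object boxed = Box(42);
        Console.WriteLine(boxed.GetType());  // System.Int32
        Console.WriteLine(Unbox(boxed));     // 42
    }
}
```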
You will learn more about the object type and its methods throughout this book.