Interpreters and compilers read programs written in a programming language. Interpreters execute these programs directly, whereas compilers first translate them into machine language or into another programming language. Both usually contain, among other components, a lexer and a parser.
An interpreter may omit the code generation and run the parsed program directly without a dedicated compilation step.
The lexer (also called a scanner or tokenizer) splits an input program into its smallest meaningful parts, the so-called tokens. Each token consists of a token class (for example, numerical value or variable identifier) and the actual token contents. For example, a lexer for a calculator, given the input string "2 + (3 * a)", might generate the following list of tokens:
Number ("2")
Addition operator ("+")
Opening bracket ("(")
Number...
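
The tokenization described above can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation. The function name tokenize, the table TOKEN_SPEC, and the class names "Identifier", "Multiplication operator" and "Closing bracket" are assumptions made for this example. Only "Number", "Addition operator" and "Opening bracket" are taken from the list above.

```python
import re

# Token classes and the regular expressions that recognize them.
# "Number", "Addition operator" and "Opening bracket" appear in the text;
# the other class names are assumptions made for this sketch.
TOKEN_SPEC = [
    ("Number", r"\d+"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Addition operator", r"\+"),
    ("Multiplication operator", r"\*"),
    ("Opening bracket", r"\("),
    ("Closing bracket", r"\)"),
]

WHITESPACE = re.compile(r"\s+")
PATTERNS = [(name, re.compile(regex)) for name, regex in TOKEN_SPEC]


def tokenize(source):
    """Return a list of (token class, token contents) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        # Whitespace separates tokens but does not produce one.
        ws = WHITESPACE.match(source, pos)
        if ws:
            pos = ws.end()
            continue
        # Try each token class in order at the current position.
        for name, pattern in PATTERNS:
            m = pattern.match(source, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No token class matched: the input is not valid.
            raise ValueError(
                f"unexpected character {source[pos]!r} at position {pos}"
            )
    return tokens


# Print each token as: token class ("contents")
for token_class, contents in tokenize("2 + (3 * a)"):
    print(f'{token_class} ("{contents}")')
```

Each token is represented as a pair of token class and contents, mirroring the description above. A table of regular expressions is only one possible design. Hand-written character loops and generated lexers are common alternatives.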