C Tokens

In C, a token is the smallest element of a program that is meaningful to the compiler. During lexical analysis, the compiler splits the source text into tokens. The C standard defines five kinds of token: keywords, identifiers, constants, string literals, and punctuators (which include the operators). The list below covers these, and also describes comments, preprocessor directives, and whitespace, which are not tokens themselves but affect how the source text is divided into tokens:

1. Keywords:
  • Keywords are reserved words with a fixed meaning in the language. Because they are reserved, they cannot be used as identifiers.
  • Examples: int, for, if, while, return, break, else, etc.
2. Identifiers:
  • Identifiers are names given to various program elements such as variables, functions, arrays, and user-defined types.
  • Rules for identifiers: An identifier must start with a letter (a-z, A-Z) or an underscore (_), followed by any number of letters, digits, or underscores. It cannot be a keyword. Identifiers are case-sensitive, so sum and Sum are different names.
  • Examples: main, variableName, sum, myFunction, MAX_VALUE, etc.
3. Constants:
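Constants represent fixed values written directly in the source code. Types of constants:
  • Integer Constants: Represent whole numbers (e.g., 42, 0x2A, 052). A leading minus sign is not part of the constant: -123 is the unary operator - applied to the constant 123.
  • Floating-Point Constants: Represent real numbers (e.g., 3.14, 0.001, 1e-3).
  • Character Constants: Represent single characters enclosed in single quotes (e.g., 'A', '5', '\n').
  • String Constants: Represent sequences of characters in double quotes (e.g., "Hello"). C treats these as a separate token category, string literals (see item 4).
  • Enumeration Constants: Represent the named integer values declared in an enum (e.g., RED in enum color { RED, GREEN }).
  • Named values created with #define or const (e.g., #define PI 3.14159) are not a separate kind of constant: a macro name is replaced by its constant before compilation, and a const object is a read-only variable.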
4. String Literals:
  • A string literal is a sequence of zero or more characters enclosed in double quotes ("). It is stored as a character array that ends with a null character ('\0').
  • Example: "Hello, world!".
5. Operators:
  • Operators are symbols used to perform operations on operands.
  • Types of operators: Arithmetic operators (+, -, *, /, %), Assignment operators (=, +=, -=), Comparison operators (==, !=, <, >, <=, >=), Logical operators (&&, ||, !), Bitwise operators (&, |, ^, ~, <<, >>), and more.
6. Punctuation Symbols:
  • Punctuation symbols are characters used to separate and group code elements. The C standard calls them punctuators, a category that also includes the operators.
  • Examples: Semicolon (;), Comma (,), Period (.), Parentheses (( and )), Braces ({ and }), Square Brackets ([ and ]), and more.
7. Comments:
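  • Comments are not tokens. The compiler replaces each comment with a single space before tokens are formed, so comments never reach the executable code. They are used to add explanations and documentation to the code.
  • Types of comments: Single-line comments (// to the end of the line, available since C99) and multi-line comments (/* */), which cannot be nested.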
8. Preprocessor Directives:
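  • Preprocessor directives are lines beginning with # that instruct the preprocessor, which runs before compilation proper. Directives are not tokens of the compiled program: the preprocessor carries them out and passes the resulting tokens on to the compiler.
  • Examples: #include (inserts the contents of a header file) and #define (defines a macro).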
9. Whitespace:
  • Whitespace (spaces, tabs, and newline characters) is not a token. It separates tokens that would otherwise run together (int x cannot be written intx) and improves code readability.

Tokens are the fundamental components of C source code, and the C compiler uses them to analyze and translate the program into machine code. Understanding tokens is crucial for writing, reading, and debugging C programs.