Python - Lexical Structure Flashcards
Review Major Concepts of Lexical Structure in the Python Programming Language
What is Lexical Structure in a programming language?
The lexical structure of a programming language is the set of basic rules that govern how you write programs in that language.
Lexical Structure is the lowest-level syntax of the language.
Research: Syntax in programming languages in general
Wikipedia: In computer science, the syntax of a computer language is the rules that define the combinations of symbols that are considered to be correctly structured statements or expressions in that language.
https://en.wikipedia.org/wiki/Syntax_(programming_languages)
Why is Lexical structure important?
Lexical structure is the set of rules that organize Python code within a source file. Following these rules is essential for writing readable and executable code.
List and Briefly Describe: Lexical Structures in the Python Programming Language
hint: 11 terms are listed.
Lines: the basic structure of Python code that separates statements from each other.
Indentations: the leading whitespace at the start of a line, which Python uses to group statements into blocks.
Comments: writing that is not considered code and usually used for documentation purposes.
Character Sets: Python source code uses the Unicode character set, encoded as UTF-8 by default. A different encoding can be used by placing an encoding declaration (# coding: iso-8859-1) in the source file.
Tokens: Python breaks each logical line of code into a sequence of lexical components known as tokens.
Identifier Tokens: An identifier is a name used to specify a variable, function, class, module, or other object.
Keyword Tokens: Python has 35 keywords (as of Python 3.9), which are identifiers reserved for special syntactic uses. A keyword cannot be used as a regular identifier.
Operator Tokens: non-alphanumeric characters and character combinations used to perform actions on other tokens.
Delimiter Tokens: characters and combinations of characters used to group and organize code.
Literal Tokens: A literal is the direct denotation in a program of a data value (a number, string, or container).
Statements:
Research: Tokens for programming languages in general
Wikipedia: Lexical tokenization is conversion of a text into (semantically or syntactically) meaningful lexical tokens belonging to categories defined by a “lexer” program… In the case of a programming language, the categories include identifiers, operators, grouping symbols and data types.
https://en.wikipedia.org/wiki/Lexical_analysis
Describe: Tokens
Python breaks each logical line into a sequence of elementary lexical components known as tokens. Each token corresponds to a substring of the logical line. The normal token types are identifiers, keywords, operators, delimiters, and literals.
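Example: one way to look at tokens is the standard library tokenize module. This is a minimal sketch; the sample statement and the variable names (source, tokens) are made up for illustration. Note that tokenize uses its own category names: keywords and identifiers are both reported as NAME, and operators and delimiters are both reported as OP.

```python
import io
import tokenize

# A made-up one-line statement to tokenize.
source = "total = price * 2  # double it\n"

# generate_tokens expects a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is a numeric code; tok_name maps it to a readable name.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Each printed row pairs a token category with the substring of the line that the token covers.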
Describe: The use of Lines in Lexical Structure of the Python Programming Language.
Python code is processed by a lexical analyzer, which first reads the physical lines of the source file and assembles them into logical lines of code.
In practice, we write a statement as a single logical line, which can span one or more physical lines.
The Python lexical analyzer breaks the logical lines of source code into tokens before feeding them to the parser.
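Example: the difference between physical and logical lines can be counted with the tokenize module, which emits a NEWLINE token at the end of each logical line and an NL token for line breaks that do not end one. This is a sketch with a made-up source string, not a description of how the interpreter itself is implemented.

```python
import io
import tokenize

# Four physical lines: a list literal spread over three lines, then a print call.
source = "values = [1,\n          2,\n          3]\nprint(values)\n"

physical_lines = source.splitlines()

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# NEWLINE marks the end of a logical line; line breaks inside brackets are NL.
logical_lines = sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)

print(len(physical_lines), "physical lines")
print(logical_lines, "logical lines")
```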
References
The Python Language Reference 1
Python in a Nutshell 2
1 The Python Language Reference https://docs.python.org/3.9//reference/lexical_analysis.html
2 Python in a Nutshell, 4th Edition (see Lexical Structure section in Chapter 3)
List and Describe: types of line joining
Line joining in Python is a way for a logical line of code to span multiple physical lines. There are two types of line joining: explicit line joining and implicit line joining.
Explicit Line Joining: When a physical line ends in a backslash that is not part of a string literal or comment, it is joined with the following physical line, forming a single logical line.
Implicit Line Joining: Expressions in parentheses, square brackets or curly braces can be split over more than one physical line without using backslashes.
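Example: a short sketch of both kinds of line joining. The names total and numbers are illustrative only.

```python
# Explicit line joining: the trailing backslash joins this physical line
# with the next one, so both form a single logical line.
total = 1 + 2 + \
        3 + 4

# Implicit line joining: inside square brackets (or parentheses or curly
# braces) the expression may continue on the next line without a backslash.
numbers = [
    1, 2,
    3, 4,
]

print(total)
print(numbers)
```

Implicit joining is usually the easier form to maintain, since a stray space after a backslash breaks explicit joining.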
Research: Block Structure for programming languages in general.
hint: search for ‘Programming Language Block Structure’
Wikipedia: In computer programming, a block or code block or block of code is a lexical structure of source code which is grouped together… Blocks have two functions: to group statements so that they can be treated as one statement, and to define scopes for names to distinguish them from the same name used elsewhere.
https://en.wikipedia.org/wiki/Block_(programming)
Describe: Indentations
Python uses indentation to express the block structure of a program.
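Example: a minimal sketch of indentation defining blocks. The function classify and the string bad_source are hypothetical, written only for this card. The second half compiles a snippet whose if body is not indented, to show that Python rejects it.

```python
def classify(n):
    # The two indented lines below form the body of the function.
    if n < 0:
        # This deeper indentation marks the body of the if statement.
        return "negative"
    # Back at the function's indentation level: the if block has ended.
    return "non-negative"


# The body of the if statement is missing its indentation.
bad_source = "if True:\nprint('not indented')\n"

try:
    compile(bad_source, "<example>", "exec")
    error_name = None
except IndentationError as exc:
    error_name = type(exc).__name__

print(classify(-5), classify(5), error_name)
```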
Research: Unicode
Describe: the use of Character Sets in the Python Programming Language
A Python source file can use any Unicode character, encoded by default as UTF-8. You may choose to tell Python that a certain source file is written in a different encoding.
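Example: the standard library function tokenize.detect_encoding reports which encoding a piece of source code declares. This is a sketch using two made-up in-memory sources (declared and default) instead of real files.

```python
import io
import tokenize

# Source bytes that start with an encoding declaration.
declared = "# coding: iso-8859-1\nname = 'café'\n".encode("iso-8859-1")

# Source bytes with no declaration, written in the default UTF-8.
default = "name = 'café'\n".encode("utf-8")

# detect_encoding expects a readline callable that returns bytes.
declared_encoding, _ = tokenize.detect_encoding(io.BytesIO(declared).readline)
default_encoding, _ = tokenize.detect_encoding(io.BytesIO(default).readline)

print(declared_encoding)
print(default_encoding)
```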
Describe: Identifiers
An identifier is a name used to specify a variable, function, class, module, or other object.
What are the rules for creating an identifier?
An identifier starts with a letter (that is, any character that Unicode classifies as a letter) or an underscore (_), followed by zero or more letters, underscores, digits, or other characters that Unicode classifies as letters, digits, or combining marks. Case is significant: lowercase and uppercase letters are distinct. Punctuation characters such as @, $, and ! are not allowed in identifiers.
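Example: the built-in string method str.isidentifier checks these rules. The candidate names below are made up for illustration. Keep in mind that isidentifier only checks the spelling rules, so it also returns True for keywords such as "if".

```python
candidates = ["total", "_private", "café", "x2", "2x", "my-var", "price$", "hello!"]

# Map each candidate name to whether it follows the identifier rules.
results = {name: name.isidentifier() for name in candidates}

for name, ok in results.items():
    print(repr(name), "valid" if ok else "invalid")

# Case is significant: these are two different identifiers.
value = 1
Value = 2
print(value != Value)
```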
Describe: Keywords
Python has 35 keywords, or identifiers that it reserves for special syntactic uses.
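Example: the standard library keyword module lists the keywords of the running interpreter. This is a sketch; the count it prints depends on the Python version in use. The last part tries to compile a made-up assignment to a keyword to show that the name is rejected.

```python
import keyword

# kwlist holds every keyword known to the running interpreter.
print(len(keyword.kwlist), "keywords")

# iskeyword tells a reserved word apart from an ordinary name.
print(keyword.iskeyword("for"))
print(keyword.iskeyword("total"))

# Using a keyword as a regular identifier is a syntax error.
try:
    compile("class = 1", "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print(rejected)
```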