On Programming

1.2. On Programming#

Becoming an expert means knowing a little about a lot of things, and becoming an expert in Python is no different.

This chapter introduces the fundamental concepts of programming using Python as our teaching language. Understanding these core concepts is essential for any programmer, as they form the foundation upon which all programming skills are built:

Programming Languages
Interpreted vs Compiled
Programming Constructs
Expressions and Statements
Number Systems
Resources

This chapter is all about the constructs of programming. This vocabulary is important: you will need it to understand computational materials, design solutions to computational problems, and communicate effectively with other programmers. These concepts are also universal across most programming languages, although syntax and specific implementations may differ. Learning these topics will give you a solid foundation for learning other languages and tackling more advanced programming challenges.

1.2.1. Programming Languages#

Learning to program means learning a new way of thinking – thinking like a computer scientist. This approach combines some of the best features of mathematics, engineering, and the natural sciences. Like mathematicians, computer scientists use formal languages to denote ideas – specifically computations. Like engineers, they design things, assembling components into systems and evaluating trade-offs among alternatives. Like scientists, they observe the behavior of complex systems, form hypotheses, and test predictions.

1.2.1.1. Natural vs Formal Languages#

Natural languages are the languages that people use to communicate, such as English, Spanish, and French. They were not designed by people; they evolved naturally. Formal languages are languages that are designed by people for specific applications. For example, the notation that mathematicians use is a formal language particularly well suited to denoting relationships among numbers and symbols. Similarly, programming languages are formal languages designed to express computations. Although formal and natural languages have some features in common, there are important differences:

Ambiguity: Natural languages are full of ambiguity, which people deal with by using contextual clues and other information. Formal languages are designed to be nearly or completely unambiguous, which means that any program has exactly one meaning, regardless of context.
Redundancy: In order to make up for ambiguity and reduce misunderstandings, natural languages use redundancy. As a result, they are often verbose. Formal languages are less redundant and more concise.
Literalness: Natural languages are full of idiom and metaphor. Formal languages mean exactly what they say.

Because we all grow up speaking natural languages, it is sometimes hard to adjust to formal languages. Formal languages are denser than natural languages, so it takes longer to read them. Additionally, the structure is important, so it is not always best to read from top to bottom or left to right. Finally, the details matter. Small errors in spelling and punctuation, which you can get away with in natural languages, can make a big difference in a formal language.
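To see how much the details matter, here is a minimal sketch in which one missing quotation mark turns a valid program into an invalid one. The helper name `is_valid_python` is hypothetical and introduced only for illustration; it relies on Python's built-in `compile()` function, which checks the syntax of the source text without running it.

```python
def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python code, False otherwise."""
    try:
        compile(source, "<example>", "exec")  # parse only; nothing is executed
        return True
    except SyntaxError:
        return False


well_formed = 'print("Hello, World!")'
missing_quote = 'print("Hello, World!)'   # the closing quotation mark is missing

print(is_valid_python(well_formed))    # True
print(is_valid_python(missing_quote))  # False
```

A human reader would understand both lines, but the formal language accepts only the first.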

1.2.1.2. Levels of Abstraction#

Programming languages exist on a spectrum, ranging from machine code, which is closest to the hardware, to high-level languages, which are closest to human language and the most readable, as shown below:

Level	Description	Examples	Code Example
High-Level Languages	Closest to human language, abstracted from hardware details	Python, Java, JavaScript, C, C++	`print("Hello, World!")`
Low-Level Languages	Closer to machine code, direct hardware manipulation	Assembly language	`mov eax, 1` `msg db 'Hello, World!', 0xA`
Machine Code	Binary instructions directly executed by CPU	Binary (0s and 1s)	`10110000 00000000` `10110001 00001010`

1.2.2. Interpreted vs. Compiled#

Programming languages also differ in how they are executed. Code written in a high-level language must be translated, by either an interpreter or a compiler, before the machine can run it. Traditionally, interpreted languages (like Python, JavaScript, and Ruby) were described as being executed line by line by an interpreter at runtime, which translated and ran the code on the fly. Compiled languages (such as C, C++, and Rust), on the other hand, require a compilation step in which the entire source code is translated into machine code before execution, which typically results in faster runtime performance. Many modern language implementations (such as those of Python, Java, and C#) use a hybrid approach: they compile code to an intermediate bytecode that is then interpreted or just-in-time compiled by a virtual machine, combining benefits of both models.

Interpreter Execution (Direct)
┌───────────┐   ┌───────────┐   ┌──────┐
│Source Code│ → │Interpreter│ → │Output│
└───────────┘   └───────────┘   └──────┘

Compiler Execution (Native)
┌───────────┐   ┌────────┐   ┌───────────┐   ┌────────┐   ┌──────┐
│Source Code│ → │Compiler│ → │Object Code│ → │Executor│ → │Output│
└───────────┘   └────────┘   └───────────┘   └────────┘   └──────┘

Python (CPython) Execution
              ┌─────────────────────────────┐
              │      Python Interpreter     │
┌─────────┐   │┌────────┐              ┌───┐│   ┌──────┐   
│script.py│ → ││Compiler│ → Bytecode → │PVM││ → │Output│
└─────────┘   │└────────┘   (.pyc)     └───┘│   └──────┘
              └─────────────────────────────┘
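The two stages in the CPython diagram can be observed from within Python itself. The following is a minimal sketch, assuming a standard CPython installation: the built-in `compile()` produces a code object containing bytecode, the standard-library `dis` module prints that bytecode in readable form, and the built-in `exec()` hands it to the virtual machine. The variable names (`source`, `code_object`, `namespace`) are chosen for illustration, and the exact bytecode listing varies between Python versions, so no expected listing is shown.

```python
import dis

source = "x = 2 + 3\nprint(x)"

# Stage 1: the compiler turns source text into a code object holding bytecode.
code_object = compile(source, "<example>", "exec")
print(type(code_object).__name__)     # code
print(len(code_object.co_code) > 0)   # True: raw bytecode bytes exist

# Show the bytecode instructions in human-readable form.
dis.dis(code_object)

# Stage 2: the Python virtual machine executes the bytecode.
namespace = {}
exec(code_object, namespace)          # prints 5
```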

1.2.3. Programming Constructs#

A computer program is a set of instructions, written in the notation of a programming language, that is given to a computer. Interestingly, there are only a few key concepts we need to know when learning to give instructions to computers, and they apply to most programming languages. The three basic control structures are:

Sequence: instructions are executed one after another (sequential execution).
Selection: decision-making/control structure; namely, choosing between alternative paths of actions within a program.
Iteration: code repetition; either count-controlled or condition-controlled.
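The three constructs above can be sketched in a few lines of Python. The variable names and the arithmetic are arbitrary and chosen only for illustration.

```python
# Sequence: statements run one after another, top to bottom.
total = 0
limit = 5

# Iteration (count-controlled): repeat once for each number from 1 to limit.
for number in range(1, limit + 1):
    # Selection: choose between two paths based on a condition.
    if number % 2 == 0:
        total += number      # even numbers are added
    else:
        total -= number      # odd numbers are subtracted

# Iteration (condition-controlled): repeat while the condition holds.
countdown = 3
while countdown > 0:
    countdown -= 1

print(total)       # -3, from -1 + 2 - 3 + 4 - 5
print(countdown)   # 0
```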

In addition to the three basic programming constructs, programming languages have construct elements such as:

Subroutine: blocks of code (function/method) in a modular program performing a particular task.
Nesting: Selection and iteration constructs can be nested within each other.
Variable: a named computer memory location that stores values.
Data type (Type): a classification of data values specifying the values and operations on the values.
Operator: symbols that perform operations on one or more operands.
Array: a data collection that stores multiple values, typically of the same data type, under a single variable.
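As a rough illustration of these elements working together, the sketch below uses a hypothetical `average` function and made-up scores. Note one assumption: Python has no built-in array type in everyday use, so a list plays that role here, even though a list may also hold values of mixed types.

```python
def average(values):
    """Subroutine: return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)   # operator: / divides two operands


scores = [80, 95, 70]   # variable bound to a list (Python's array-like collection)
label = "mean"          # variable holding a value of data type str

passed = 0
for score in scores:    # iteration ...
    if score >= 75:     # ... with a selection nested inside it
        passed += 1

print(label, round(average(scores), 2))   # mean 81.67
print(type(scores).__name__, type(label).__name__, type(passed).__name__)  # list str int
print(passed)                             # 2
```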

1.2.4. Expressions and Statements#

In programming languages, expressions and statements are fundamental building blocks for formulating and using the language.

By definition, an expression is a combination of values, variables, operators, and function calls that the Python interpreter can evaluate to produce a single value, which may be assigned to a variable for later use. Note that a single literal value, like an integer or string, can be an expression.

An expression may contain operators and operands, such as a + b * c.

A statement is a complete instruction that tells the interpreter to perform an action or to control the flow of the program. Unlike an expression, a statement does not evaluate to a value that can be used elsewhere. For example, an assignment statement creates a variable and gives it a value, but the statement itself has no value.

Computing the value of an expression is called evaluation; whereas running a statement is called execution. So, a statement performs an action. An expression computes a value. For example:

Type	Example	Description
Statement	`x = 5`	Assignment statement: Assigns 5 to `x` (changes program state). Produces no value.
	`print(x)`	Expression statement: calls `print()` to display `x` on the screen (a side effect); the call returns only `None`.
	`if x > 0:`	`if` statement: Begins a conditional block — a control flow structure; no value.
	`import math`	Import statement: makes the `math` module available; no value.
Expression	`2 + 3`	Produces the value `5`.
	`x * y`	Computes a value based on `x` and `y`.
	`len("data")`	Evaluates to `4`.

x = 5                # statement: assigns a value; nothing is displayed
x + 5                # expression: evaluates to 10; a REPL shows 10, a notebook only if it is the cell's last line
if x > 0:            # statement: controls flow; no value, only the side effect below
    print("x is positive")   # a block of code that executes if the condition is true

x is positive
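One way to probe the difference is with Python's built-in `eval()` and `exec()`: `eval()` accepts only an expression and returns its value, while `exec()` runs statements and returns `None`. The sketch below is for illustration only; the names `env`, `assignment_is_expression`, and `result` are hypothetical.

```python
env = {"x": 5}   # a small namespace in which the snippets are evaluated

# eval() accepts only expressions and returns their value.
print(eval("x + 5", env))         # 10
print(eval('len("data")', env))   # 4

# An assignment is a statement, so eval() rejects it with a SyntaxError.
try:
    eval("y = 5", env)
    assignment_is_expression = True
except SyntaxError:
    assignment_is_expression = False
print(assignment_is_expression)   # False

# exec() executes statements: it performs the action and returns None.
result = exec("y = 5", env)
print(result, env["y"])           # None 5
```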

1.2.5. Number Systems#

Advanced Topic

This section covers number systems (binary, octal, hexadecimal), which are more advanced material. While useful for understanding how computers work at a lower level, it’s not essential for writing most Python programs. Feel free to skim this section on first reading and return to it later when you need to work with different number bases.

In programming, number systems are ways of representing numbers using different bases. Computers store and process data in binary, but programmers often use other bases for convenience, readability, or hardware interaction. The four main number systems used in programming are binary (base-2), decimal (base-10), hexadecimal (base-16), and octal (base-8):

System	Base	Digits Used	Typical Use	Python Example (all = 100)
Binary	2	0–1	Hardware, CPU, memory, bitwise operations	`0b1100100` → 100
Octal	8	0–7	Unix file permissions	`0o144` → 100
Decimal	10	0–9	Human-friendly math, user input/output	`100` → 100
Hexadecimal	16	0–9, A–F	Memory addresses, colors, debugging, networking	`0x64` → 100

As you can see, binary literals start with 0b, octal literals start with 0o, and hexadecimal literals start with 0x.
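A quick check confirms that the four literals from the table are just different spellings of the same integer. The variable name `values` is chosen for illustration.

```python
values = [0b1100100, 0o144, 100, 0x64]    # binary, octal, decimal, hexadecimal

print(values)                              # [100, 100, 100, 100]
print(0b1100100 == 0o144 == 100 == 0x64)   # True
print(type(0x64).__name__)                 # int: the base is only notation
```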

All number systems use positional notation, where each digit’s value in a number depends on its position and its base. For example, the decimal number 345, or 345 (base 10), can be represented as shown in the table below. Note that each digit is multiplied by a different power of the base, which determines its value.

digit	position	digit × base^position			value
3	2	3 × 10^2	= 3 × 100	=	300
4	1	4 × 10^1	= 4 × 10	=	40
5	0	5 × 10^0	= 5 × 1	=	5
					345

Here, you see that the digit 3 in 345 means 300 because it is in the hundreds place (10^2, because it’s base 10). Therefore, we see that:

digit value = digit × (base ^ position)

You then add all the digit values together to get the value of the number:

345 = 3×10² + 4×10¹ + 5×10⁰

Following the same process of adding up the digit values, we can get the decimal value of a binary number such as 1011 (base 2):

Digit	Position	2 to the nth Power	Value
1	3	2³	8
0	2	2²	0
1	1	2¹	2
1	0	2⁰	1
			11

So, we can do base conversion from (base 2) to (base 10) by:

1011₂ = 1×2³ + 0×2² + 1×2¹ + 1×2⁰ = 8 + 0 + 2 + 1 = 11₁₀

Or, let us put the place values at the top, which I prefer:

Position	2^3	2^2	2^1	2^0	Total
Place value	8	4	2	1	
Digit	1	0	1	1	
Calculation	1×8	0×4	1×2	1×1	
Value	8	0	2	1	11
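The rule digit value = digit × (base ^ position) translates almost directly into code. The function below is a sketch written for illustration (the name `positional_value` is hypothetical) and handles only the digits 0 to 9, so it covers bases up to 10. In practice, Python's built-in `int()` with a base argument does the same job.

```python
def positional_value(digits: str, base: int) -> int:
    """Sum digit × base**position over all digits (digits 0-9 only)."""
    total = 0
    # reversed() makes the rightmost digit position 0.
    for position, char in enumerate(reversed(digits)):
        digit = int(char)
        if digit >= base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        total += digit * base ** position
    return total


print(positional_value("345", 10))   # 345 = 300 + 40 + 5
print(positional_value("1011", 2))   # 11  = 8 + 0 + 2 + 1
print(positional_value("144", 8))    # 100 = 64 + 32 + 4
```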

The base 2 system is commonly known as the basis of computing. To count from 0 to 5 (base 10) in binary:

0 = 0b0000
1 = 0b0001
2 = 0b0010
3 = 0b0011
4 = 0b0100
5 = 0b0101
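The same count can be generated with the built-in `format()` function. In this sketch, the format specification `"#06b"` asks for binary output with the `0b` prefix, zero-padded to six characters in total; the list name `binary_strings` is chosen for illustration.

```python
# Build the zero-padded binary spelling of each number from 0 to 5.
binary_strings = [format(number, "#06b") for number in range(6)]

for number, text in enumerate(binary_strings):
    print(number, "=", text)   # first line: 0 = 0b0000, last line: 5 = 0b0101
```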

To see graphically that the number 100 (base 10) is equal to 1100100 (base 2) (or 0b1100100, where b stands for binary):

0b1100100
  ││││││└ 0 × 2^0 = 0 × 1  = 0
  │││││└─ 0 × 2^1 = 0 × 2  = 0
  ││││└── 1 × 2^2 = 1 × 4  = 4
  │││└─── 0 × 2^3 = 0 × 8  = 0
  ││└──── 0 × 2^4 = 0 × 16 = 0
  │└───── 1 × 2^5 = 1 × 32 = 32
  └────── 1 × 2^6 = 1 × 64 = 64
                             __
                             100

Python has built-in functions bin(), oct(), hex(), and int() for base conversion between number systems. The functions bin(), oct(), and hex() return strings prefixed by 0b, 0o, and 0x, respectively. Note that when int() converts such a string back to an integer, it requires the base as a second argument. Additionally, Python recognizes literals written in other number systems and automatically displays them in base 10 when evaluated.

num_b = bin(100)        # '0b1100100'
num_o = oct(100)        # '0o144'
num_h = hex(100)        # '0x64', converted from 100
num_h2 = hex(0b1100100) # '0x64', converted from base 2
num_i_h = int(num_h, 16) # 100 (an int, not a string)
num_i_b = int(num_b, 2)  # 100 (an int, not a string)

print(num_b, num_o, num_h, num_h2, num_i_h, num_i_b, sep="\n")

### Exercise 
# Q1. What's the value of 10 (base 10) in binary? (Print it as a string if you use it as a literal)
# Q2. What's the value of decimal 64 in base 16?
# Try to produce the same output as the cell below. You may need to use the print() function.
### Your code starts here



### Your code stops here

0b1010
0x40

1.2.5.1. Character Encoding#

For computers, the smallest unit of data is a bit (binary digit). A bit can only be 0 or 1, which can represent off/on, false/true, or no voltage/voltage. A byte, on the other hand, is a group of 8 bits, which can represent 2^8 = 256 different values (0-255), and is the fundamental addressable unit in modern computing.

Computers only process machine code. For humans to talk to computers, we need something in between that’s understood by both, and that is encoding. For example, letter A is represented as 65 (base 10) or 0b1000001 in the ASCII (American Standard Code for Information Interchange) code table. ASCII encoding covers English characters (including special characters, numbers, and the alphabet). An early version of the ASCII table is the MIL-STD-188-100:

In this chart, you can see that the letter A has the binary bits 1000001. When comparing string/character literals, we say that ‘B’ is greater than ‘A’ because of the encoding (the ASCII value of ‘B’ is 66, which is greater than the ASCII value of ‘A’, 65).
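Python exposes these codes through the built-in functions `ord()` (character to code) and `chr()` (code to character), so the comparison can be checked directly. A short sketch:

```python
print(ord("A"))        # 65
print(bin(ord("A")))   # 0b1000001
print(chr(66))         # B

# Characters compare by their codes, so 'B' (66) is greater than 'A' (65).
print("B" > "A")       # True
# Lowercase letters have higher codes than uppercase ones: 'a' is 97.
print("a" > "B")       # True
```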

Since the ASCII code only represents English characters, the Unicode Standard and the standard Unicode Transformation Format (UTF) schemes were proposed to support the use of text in all of the world’s writing systems that can be digitized. Among them, [UTF-8](https://en.wikipedia.org/wiki/UTF-8) is the dominant encoding on the internet and is supported by all modern operating systems and programming languages.

ASCII uses 1 byte per character (7 bits originally, and 8 bits for extended ASCII) for its standard 128 characters. UTF-8, by contrast, is variable-length: it uses one to four bytes (8 to 32 bits) per character to cover 1,112,064 code points, while still encoding standard ASCII characters in just 1 byte for backward compatibility. With the large number of code points supported, UTF-8 is able to represent emojis and East Asian language characters.
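The variable length of UTF-8 is easy to observe with the string method `encode()`. In the sketch below, the four sample characters are arbitrary picks for illustration, written as escape sequences so that the example does not depend on the editor's or terminal's encoding.

```python
# Sample characters: Latin letter A, accented e, a Chinese character, an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

byte_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")          # bytes object holding the UTF-8 encoding
    byte_lengths.append(len(encoded))
    print(f"U+{ord(ch):04X} takes {len(encoded)} byte(s)")

print(byte_lengths)                       # [1, 2, 3, 4]
print("A".encode("utf-8") == b"A")        # True: ASCII characters keep their 1-byte form
```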

### Exercise 
### What's the decimal code for the letter "C" in the ASCII code? Save it to a variable named c_dec.
### What's the binary code for the letter "C" in the ASCII code? Save it to a variable named c_bin using the bin() function.
### Try to produce the same output as the cell below. Use the print() function and escape sequences.
### Your code starts here




### Your code stops here

The decimal code for the letter "C" is 67.
The binary code for the letter "C" is 0b1000011.

1.2.6. Resources#

Official Python Documentation

The Python Standard Library - Modules and APIs that ship with Python
The Python Language Reference - Formal spec for syntax and semantics
Python Tutorial (Official) - Guided introduction from the core docs team
PEP 20 - The Zen of Python - 19 guiding aphorisms for Python design
Python Package Index (PyPI) - Repository of third-party Python packages

Style Guides and Best Practices

PEP 8 - Style Guide for Python Code - Canonical formatting and naming conventions
Google Python Style Guide - Industry-standard style guide used at Google; differs from PEP 8 in places, such as forbidding wildcard imports (from X import *) and requiring more structured docstrings for functions/methods.

Tutorials

Real Python - Tutorials and articles on Python programming
Python for Everybody - Free interactive textbook and course
Automate the Boring Stuff with Python - Practical programming for beginners

Interactive Practice

Python Tutor - Visualize code execution step by step
LeetCode - Coding practice and interview preparation
HackerRank Python - Practice problems and challenges