SheCodes Lab.
Python and C++, written side by side. Built by an aerospace engineer for engineers and developers who need to actually use the language.
Aerospace & Aeronautical Engineer
What is Programming?
Before writing any code, you need one mental model. This page gives you that model.
| What you want to build | Language to use |
|---|---|
| Websites and web apps | Python (backend), JavaScript (frontend) |
| Data analysis, graphs, reports | Python (pandas, matplotlib) |
| Games | C++ (Unreal Engine), C# (Unity) |
| Engineering simulations, FEA | C++, Python (NumPy/SciPy) |
| Mobile apps (iPhone/Android) | Swift (iPhone), Kotlin (Android) |
| Machine learning / AI | Python (TensorFlow, PyTorch) |
| Embedded systems / microcontrollers | C++, C |
| Automate repetitive computer tasks | Python |
Setting Up Your Computer
Two things to install: Python and an editor. Follow these steps.
On Mac:
1. Go to python.org
2. Click the yellow "Download Python" button
3. Open the downloaded file and follow the installer
4. Done.

On Windows:
1. Go to python.org
2. Download the installer
3. Open it: tick "Add Python to PATH" before clicking Install
4. Done.

On Linux, Python is usually preinstalled. Check with:

python3 --version

If not: sudo apt install python3
To verify, open a terminal, type python --version, and press Enter. You should see something like Python 3.12.0. If you do, Python is installed correctly.

# Find out where you are (your current folder)
pwd    # Mac/Linux
cd     # Windows (just typing cd shows your location)

# List files in the current folder
ls     # Mac/Linux
dir    # Windows

# Move into a folder
cd Documents
cd Documents/my_code

# Go back up one folder
cd ..

# Run a Python file
python hello.py
python3 hello.py   # if the above doesn't work

# Install a Python library
pip install numpy
pip install pandas matplotlib scipy
| Editor | Best for | Download |
|---|---|---|
| VS Code | The most widely used editor. Works for Python, C++, and everything else. | code.visualstudio.com |
| PyCharm Community | Python-specific. More features for Python, slightly heavier. | jetbrains.com/pycharm |
| Jupyter Notebook | Data analysis and engineering. Run code block by block, see results inline. | pip install notebook |
| Thonny | Total beginners. Simple interface, shows you exactly what each line does. | thonny.org |
# 1. Open VS Code
# 2. File → New File → save it as: hello.py
#    (the .py tells the computer it's Python)
# 3. Type this into the file:

print("Hello! I am learning to code.")
print("This is my first program.")

# 4. Open the terminal in VS Code: View → Terminal
# 5. Type: python hello.py and press Enter
# 6. You should see:
#    Hello! I am learning to code.
#    This is my first program.

# Congratulations — you just ran your first program.
How Computers Think
You don't need to know electronics. But understanding how a computer works changes how you read error messages, write loops, and think about memory.
When your program runs, each variable gets a slot in memory. When you later use a name like age, Python looks up that slot and gives you what's inside.
Glossary: Every Term Explained
Every technical term used on this site. Searchable, grouped by topic.
| Term | Plain English explanation |
|---|---|
| Algorithm | A step-by-step method for solving a problem. A recipe is an algorithm. A sorting method is an algorithm. |
| Argument | A value you pass into a function when calling it. In print("hello"), the argument is "hello". |
| Array | An ordered collection of values stored together, accessed by number index. Index starts at 0. |
| Boolean | A value that is either True or False. Used in conditions: if x > 5 evaluates to True or False. |
| Bug | A mistake in code that causes it to behave incorrectly. Named after a real moth found in a computer in 1947. |
| Class | A blueprint for creating objects. Defines what data they hold and what they can do. |
| Compile | Translate your entire source code into machine code before running. C++ is compiled. Python is interpreted. |
| Condition | An expression that evaluates to True or False, used to make decisions. age >= 18 is a condition. |
| Crash | When a program stops unexpectedly due to an error it couldn't handle. |
| Data type | The kind of value a variable holds: integer, float, string, boolean. Determines what you can do with it. |
| Debug | Finding and fixing errors (bugs) in code. One of the most important and time-consuming parts of programming. |
| Dictionary | A collection of key-value pairs. Like a real dictionary: look up a word (key) to get its definition (value). |
| Exception | An error that occurs while the program is running. Can be caught with try/except (Python) or try/catch (C++). |
| Float | A number with a decimal point. 3.14, -0.5, 1.0 are all floats. |
| Function | A named block of code that does one job. You define it once and call it by name whenever needed. |
| IDE | Integrated Development Environment: a code editor with extra tools like error highlighting and a debugger. VS Code, PyCharm. |
| Index | The position number of an item in a list or array. Starts at 0, not 1. list[0] is the first item. |
| Integer | A whole number with no decimal point. 1, -5, 1000 are integers. |
| Interpret | Read and execute code one line at a time as the program runs. Python is interpreted. Slower but easier to use. |
| Library | A collection of pre-written code you can import and use. import numpy loads thousands of maths functions someone else wrote. |
| Loop | A block of code that repeats. for loops repeat a set number of times. while loops repeat until a condition is false. |
| Method | A function that belongs to an object or class. Called with a dot: text.upper() calls the upper method on a string. |
| NaN | "Not a Number": a special floating-point value that appears when maths produces an undefined result. Poisons downstream calculations. |
| None / null | A special value meaning "nothing" or "no value". Python uses None, C++ uses nullptr (for pointers) or 0. |
| Object | An instance of a class. A specific thing created from a blueprint. alice = Person("Alice", 22): alice is an object. |
| Parameter | A variable in a function definition that receives a value when the function is called. def greet(name):: name is a parameter. |
| Pointer | C++ only. A variable that stores the memory address of another variable. Powerful but dangerous if misused. |
| Print | Display output to the terminal/console. The first thing every programmer learns. print() in Python, cout in C++. |
| RAM | Random Access Memory: fast, temporary storage where your program and all its variables live while running. Cleared when program ends. |
| Return | Send a value back from a function. return x * 2 gives the result back to wherever the function was called. |
| Runtime error | An error that happens while the program is running, not when compiling. Division by zero, accessing a missing index. |
| Scope | Where a variable is visible. Local scope = inside one function. Global scope = everywhere in the file. |
| Semicolon | The ; character required at the end of every statement in C++. Forgetting it is the most common C++ beginner error. |
| String | Text data: any sequence of characters wrapped in quotes. "Hello", "42", "Noor" are all strings. |
| Syntax | The grammar rules of a programming language. Every language has its own syntax: the exact symbols and words it understands. |
| Syntax error | You wrote something the language doesn't understand. Often a missing colon, bracket, or quote. Fixed before the program runs. |
| Terminal | A text-based window where you type commands directly to your computer. Also called: command prompt, shell, console. |
| Variable | A named container in memory that holds a value. age = 22 creates a variable called age holding the value 22. |
| Vector | In C++: a dynamic array that can grow in size. vector<int> v;. In maths: a quantity with magnitude and direction. |
| void | C++ only. Means "this function returns nothing." Used when a function just does something rather than calculating a result. |
Hello, World
Every single programmer in history started with this program. It does one thing: print "Hello, World!" on the screen. It sounds trivial: but it teaches you the most important skill in coding: getting your first program to actually run.
print("Hello, World!")
#include <iostream>
using namespace std;

int main() {
    cout << "Hello, World!" << endl;
    return 0;
}
print() is a built-in command. Whatever you put inside the parentheses in quotes gets printed to the screen. That's the entire program. One line.
C++, line by line:
- #include <iostream>: loads the input/output tools. Without this, C++ can't print anything.
- using namespace std;: lets you write cout instead of std::cout every time.
- int main() { }: every C++ program must have a main() function. This is where your code runs.
- cout << "Hello, World!" << endl;: cout sends text to the screen. endl moves to the next line.
- return 0;: tells the computer "program finished successfully."
print("Hello, World!")
print("I am learning to code.")
print("This is my first program.")
int main() {
    cout << "Hello, World!" << endl;
    cout << "I am learning to code." << endl;
    cout << "This is my first program." << endl;
    return 0;
}
endl in C++ means "end of line". It moves the cursor to the next line. Python's print() does this automatically after every call.
Variables & Data Types
A variable is a named location in memory. Data types define what kind of value it holds and what operations are valid on it.
In Python you create a variable simply by writing name = value. In C++ you must first declare what type of value the variable holds before you store anything.

# No type needed — Python figures it out
name = "Alice"
age = 22
height = 1.68
is_coder = True

print(name)
print(age)
print(height)
print(is_coder)
// Must declare the type first
string name = "Alice";
int age = 22;
double height = 1.68;
bool isCoder = true;

cout << name << endl;
cout << age << endl;
cout << height << endl;
cout << isCoder << endl;
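If you're ever unsure which type Python inferred, the built-in type() function tells you. A quick sketch using the same variables as above:

```python
name = "Alice"
age = 22
height = 1.68
is_coder = True

print(type(name))      # <class 'str'>
print(type(age))       # <class 'int'>
print(type(height))    # <class 'float'>
print(type(is_coder))  # <class 'bool'>
```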
| Type | What it stores | Python | C++ |
|---|---|---|---|
| Integer | Whole numbers, no decimal point | age = 22 | int age = 22; |
| Float / Double | Numbers with a decimal point | height = 1.68 | double height = 1.68; |
| String | Text, always in quotes | name = "Alice" | string name = "Alice"; |
| Boolean | True or False, two values only | done = True | bool done = true; |
In Python you insert variables into text with an f-string: put the variable inside {} curly braces. In C++ you chain values with <<.

name = "Alice"
age = 22

# f-string: put variable in {}
print(f"My name is {name}.")
print(f"I am {age} years old.")
print(f"In 5 years I will be {age + 5}.")
string name = "Alice";
int age = 22;

// Chain with <<
cout << "My name is " << name << "." << endl;
cout << "I am " << age << " years old." << endl;
cout << "In 5 years: " << age+5 << endl;
| Operator | Meaning | Example | Result |
|---|---|---|---|
| + | Addition | 10 + 3 | 13 |
| - | Subtraction | 10 - 3 | 7 |
| * | Multiplication | 10 * 3 | 30 |
| / | Division | 10 / 3 | 3.333... |
| % | Remainder (modulo) | 10 % 3 | 1 |
| ** | Power (Python only) | 2 ** 8 | 256 |
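The table above can be checked directly in Python. One extra operator, // (floor division), is worth knowing here because it mimics C++'s integer division:

```python
print(10 + 3)   # 13
print(10 - 3)   # 7
print(10 / 3)   # 3.3333... (Python / always gives a float)
print(10 // 3)  # 3  (floor division: drops the remainder, like C++ int division)
print(10 % 3)   # 1
print(2 ** 8)   # 256
```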
In C++, dividing two int values gives an int result: 10 / 3 gives 3, not 3.333. For decimal results, use double for at least one operand: 10.0 / 3 = 3.333.
Getting Input from the User
Input lets a program respond to what a user provides at runtime. Here's how both languages handle it.
In Python, input() pauses the program, shows a message, and waits. Whatever the user types gets stored in a variable. In C++, cin >> reads what the user typed.

name = input("What is your name? ")
print(f"Hello, {name}! Welcome.")
string name;
cout << "What is your name? ";
cin >> name;
cout << "Hello, " << name << "! Welcome." << endl;
input() always returns text, even if the user types a number. You must convert it using int() or float(), otherwise you can't do maths with it. C++ handles this automatically if you declare the right type.
# Convert to int so we can do maths
age = int(input("How old are you? "))
print(f"In 10 years you will be {age + 10}.")

# float() for decimal numbers
price = float(input("Price: "))
print(f"With 10% tax: {price * 1.1:.2f}")
// Declare as int — cin reads a number
int age;
cout << "How old are you? ";
cin >> age;
cout << "In 10 years: " << age+10 << endl;

double price;
cout << "Price: ";
cin >> price;
cout << "With tax: " << price*1.1 << endl;
The classic beginner mistake is forgetting to wrap input() with int(). If you do age = input("Age: ") and the user types 25, then age + 10 gives you an error because you can't add a number to text.
Quick check: age = input("Age: "). The user types 30. What type is age?
Making Decisions: If / Else
Control flow determines which code runs based on conditions. This is how programs branch.
age = 18
if age >= 18:
    print("You can vote.")
else:
    print("Too young to vote.")
int age = 18;
if (age >= 18) {
    cout << "You can vote." << endl;
} else {
    cout << "Too young to vote." << endl;
}
The colon and the indentation before print tell Python "this line belongs to the if." In C++, the curly braces do the same job. Miss either and your code breaks.

To check several conditions in sequence, use elif (Python) or else if (C++). Only the first true condition runs.

score = 75
if score >= 90:
    print("Grade: A")
elif score >= 70:
    print("Grade: B")
elif score >= 50:
    print("Grade: C")
else:
    print("Grade: F")
int score = 75;
if (score >= 90) {
    cout << "Grade: A" << endl;
} else if (score >= 70) {
    cout << "Grade: B" << endl;
} else if (score >= 50) {
    cout << "Grade: C" << endl;
} else {
    cout << "Grade: F" << endl;
}
| Operator | Meaning | True example | False example |
|---|---|---|---|
| == | Equal to | 5 == 5 | 5 == 6 |
| != | Not equal to | 5 != 6 | 5 != 5 |
| > | Greater than | 6 > 5 | 5 > 6 |
| < | Less than | 5 < 6 | 6 < 5 |
| >= | Greater than or equal | 5 >= 5 | 4 >= 5 |
| <= | Less than or equal | 5 <= 5 | 6 <= 5 |
Quick check: what is elif in Python equivalent to in C++?
Loops: Repeat with Purpose
A loop executes a block of code repeatedly. You define the condition or count: the loop handles the rest.
# Print 0, 1, 2, 3, 4
for i in range(5):
    print(i)

# Print 1 to 5
for i in range(1, 6):
    print(i)
// Print 0, 1, 2, 3, 4
for (int i = 0; i < 5; i++) {
    cout << i << endl;
}

// Print 1 to 5
for (int i = 1; i <= 5; i++) {
    cout << i << endl;
}
The C++ for loop header has three parts:
- int i = 0: start: create a counter variable set to 0
- i < 5: condition: keep looping while this is true
- i++: step: add 1 to the counter after each loop (i++ means i = i + 1)

A while loop keeps running as long as a condition stays true. Use it when you don't know in advance exactly how many times you need to loop.

count = 1
while count <= 5:
    print(f"Count is {count}")
    count += 1  # count = count + 1
print("Done!")
int count = 1;
while (count <= 5) {
    cout << "Count is " << count << endl;
    count++;
}
cout << "Done!" << endl;
If you forget count++, the loop runs forever. This is called an infinite loop and it freezes your program.

You can also loop directly over a collection:

fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)
string fruits[] = {"apple", "banana", "cherry"};
for (string fruit : fruits) {
    cout << fruit << endl;
}
Quick check: how many times does the loop for i in range(4): run?
Functions: Write Once, Use Anywhere
A function is a named, reusable block of code. Define it once, call it wherever needed.
# Define the function
def greet():
    print("Hello! Welcome.")

# Call it (can call as many times as you want)
greet()
greet()
// Define the function (void = returns nothing)
void greet() {
    cout << "Hello! Welcome." << endl;
}

int main() {
    greet();  // Call it
    greet();
    return 0;
}
def greet(name):
    print(f"Hello, {name}!")

greet("Alice")  # Hello, Alice!
greet("Bob")    # Hello, Bob!

def add(a, b):
    print(a + b)

add(5, 3)    # 8
add(10, 20)  # 30
void greet(string name) {
    cout << "Hello, " << name << "!" << endl;
}

void add(int a, int b) {
    cout << a + b << endl;
}

int main() {
    greet("Alice");
    greet("Bob");
    add(5, 3);
    add(10, 20);
    return 0;
}
def square(n):
    return n * n

result = square(5)
print(result)     # 25
print(square(8))  # 64

def full_name(first, last):
    return first + " " + last

print(full_name("Ada", "Lovelace"))
int square(int n) {
    return n * n;
}

string fullName(string f, string l) {
    return f + " " + l;
}

int main() {
    cout << square(5) << endl;  // 25
    cout << square(8) << endl;  // 64
    cout << fullName("Ada", "Lovelace") << endl;
    return 0;
}
In C++, to return a value, replace void with the return type: int if the function returns a whole number, string if it returns text, double if it returns a decimal.
Arrays & Lists
A list (Python) or array (C++) stores multiple values in a single container, accessed by index.
names = ["Alice", "Bob", "Clara"]
print(names[0])    # Alice (first item)
print(names[1])    # Bob (second item)
print(names[2])    # Clara (third item)
print(len(names))  # 3 (total count)

scores = [88, 92, 76, 95]
print(scores[0])   # 88
string names[] = {"Alice", "Bob", "Clara"};
cout << names[0] << endl;  // Alice
cout << names[1] << endl;  // Bob
cout << names[2] << endl;  // Clara

int scores[] = {88, 92, 76, 95};
cout << scores[0] << endl;  // 88
scores = [88, 92, 76, 95]
total = 0
for s in scores:
    total += s

average = total / len(scores)
print(f"Total: {total}")
print(f"Average: {average}")
int scores[] = {88, 92, 76, 95};
int total = 0;
for (int s : scores) {
    total += s;
}
double avg = total / 4.0;
cout << "Total: " << total << endl;
cout << "Average: " << avg << endl;
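Python lists can also grow after creation, and built-ins like max, min, and sum save you a loop. A small sketch (these built-ins are not shown above, but are standard Python):

```python
scores = [88, 92, 76]
scores.append(95)   # add a new value to the end
print(scores)       # [88, 92, 76, 95]

print(len(scores))  # 4
print(max(scores))  # 95
print(min(scores))  # 76
print(sum(scores) / len(scores))  # 87.75 (the average)
```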
Quick check: if x = [10, 20, 30], what is x[2]?
Matrices: Grids of Data
A matrix is a two-dimensional array: rows and columns. Each element is addressed by [row][column].
For example, the element at row 1, column 2 is written [1][2].

# 3 rows, 3 columns
matrix = [
    [1, 2, 3],  # row 0
    [4, 5, 6],  # row 1
    [7, 8, 9]   # row 2
]

print(matrix[0][0])  # 1 (row 0, col 0)
print(matrix[1][2])  # 6 (row 1, col 2)
print(matrix[2][1])  # 8 (row 2, col 1)
// 3 rows, 3 columns
int matrix[3][3] = {
    {1, 2, 3},  // row 0
    {4, 5, 6},  // row 1
    {7, 8, 9}   // row 2
};

cout << matrix[0][0] << endl;  // 1
cout << matrix[1][2] << endl;  // 6
cout << matrix[2][1] << endl;  // 8
matrix = [[1,2,3],[4,5,6],[7,8,9]]
for row in matrix:
    for val in row:
        print(val, end=" ")
    print()  # new line after each row
int m[3][3] = {{1,2,3},{4,5,6},{7,8,9}};
for (int r = 0; r < 3; r++) {
    for (int c = 0; c < 3; c++) {
        cout << m[r][c] << " ";
    }
    cout << endl;
}
Output:
1 2 3
4 5 6
7 8 9
Strings in Depth
Strings have built-in methods for searching, slicing, and transforming text. These come up constantly in practical work.
name = "Alice"
print(len(name))  # 5
print(name[0])    # A (first char)
print(name[4])    # e (last char)
print(name[-1])   # e (last char shortcut)
string name = "Alice";
cout << name.length() << endl;  // 5
cout << name[0] << endl;        // A
cout << name[4] << endl;        // e
text = "Hello, World!"
print(text[0:5])  # Hello (index 0 to 4)
print(text[7:])   # World! (index 7 to end)
print(text[:5])   # Hello (start to index 4)
print(text[-6:])  # World! (last 6 chars)
string text = "Hello, World!";
// substr(start, length)
cout << text.substr(0, 5) << endl;  // Hello
cout << text.substr(7) << endl;     // World!
cout << text.substr(7, 5) << endl;  // World
In Python slicing text[start:end]: the end index is NOT included. So text[0:5] gives characters at positions 0, 1, 2, 3, 4. C++'s substr(start, length) takes a start position and a character count instead.

Strings come with built-in methods you call with a dot, like text.upper(). These are identical in concept across both languages.

text = " Hello, World! "
print(text.upper())  # HELLO, WORLD!
print(text.lower())  # hello, world!
print(text.strip())  # Hello, World! (no spaces)
print(text.replace("World", "Python"))  # Hello, Python!

words = "one,two,three"
print(words.split(","))  # ['one', 'two', 'three']

print("ell" in text)  # True (contains check)
#include <algorithm>
#include <cctype>

string text = "Hello, World!";

// to uppercase
string up = text;
transform(up.begin(), up.end(), up.begin(), ::toupper);
cout << up << endl;

// find a substring
int pos = text.find("World");
cout << pos << endl;  // 7

// check if contains
if (text.find("World") != string::npos)
    cout << "Found!" << endl;
# Joining strings
first = "Ada"
last = "Lovelace"
full = first + " " + last
print(full)  # Ada Lovelace

# Number to string
age = 25
msg = "Age: " + str(age)
print(msg)  # Age: 25

# String to number
num = int("42")
print(num + 8)  # 50
// Joining strings
string first = "Ada";
string last = "Lovelace";
string full = first + " " + last;
cout << full << endl;  // Ada Lovelace

// Number to string
int age = 25;
string msg = "Age: " + to_string(age);
cout << msg << endl;  // Age: 25

// String to number
int num = stoi("42");
cout << num + 8 << endl;  // 50
Quick check: what does "hello".upper() return?
Dictionaries & Maps
A dictionary (Python) or map (C++) stores key-value pairs. Access by key rather than by numeric index.
# Create a dictionary
person = {
    "name": "Alice",
    "age": 22,
    "city": "Doha"
}

# Access by key
print(person["name"])  # Alice
print(person["age"])   # 22

# Add or update
person["email"] = "[email protected]"
person["age"] = 23
print(person)
#include <map>

map<string, string> person;
person["name"] = "Alice";
person["city"] = "Doha";

cout << person["name"] << endl;  // Alice
cout << person["city"] << endl;  // Doha

// Map with int values
map<string, int> scores;
scores["Alice"] = 95;
scores["Bob"] = 87;
grades = {
"Alice": 95,
"Bob": 87,
"Clara": 91
}
# Loop through keys and values
for name, score in grades.items():
    print(f"{name}: {score}")

# Check if key exists
if "Alice" in grades:
    print("Alice is in the list")
map<string, int> grades;
grades["Alice"] = 95;
grades["Bob"] = 87;
grades["Clara"] = 91;

// Loop through all pairs
for (auto pair : grades) {
    cout << pair.first << ": " << pair.second << endl;
}

// Check if key exists
if (grades.count("Alice"))
    cout << "Alice found" << endl;
person = {"name": "Alice", "age": 22}
print(person.keys()) # dict_keys(['name', 'age'])
print(person.values()) # dict_values(['Alice', 22])
print(len(person)) # 2
# Safe access — returns a default instead of crashing if the key is missing
print(person.get("email", "not found"))
# Delete a key
del person["age"]
print(person)
map<string, string> person;
person["name"] = "Alice";
person["age"] = "22";

cout << person.size() << endl;  // 2

// Delete a key
person.erase("age");
cout << person.size() << endl;  // 1
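A classic use of dictionaries is counting how often things occur. A short sketch (the word list here is invented for illustration):

```python
words = ["apple", "banana", "apple", "cherry", "apple"]

counts = {}
for w in words:
    if w in counts:
        counts[w] += 1   # seen before: bump the tally
    else:
        counts[w] = 1    # first time: create the key

print(counts)  # {'apple': 3, 'banana': 1, 'cherry': 1}
```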
Quick check: if d = {"x": 10, "y": 20}, how do you access the value 10?
Logical Operators
Logical operators combine or invert boolean conditions. AND, OR, and NOT are the building blocks of all conditional logic.
Python uses the words and, or, not. C++ uses the symbols &&, ||, !.
| Operator | Python | C++ | Meaning |
|---|---|---|---|
| AND | and | && | Both conditions must be true |
| OR | or | || | At least one must be true |
| NOT | not | ! | Flips true to false, false to true |
age = 20
score = 85

# AND — both must be true
if age >= 18 and score >= 80:
    print("Eligible and high score")

# OR — at least one true
if age < 13 or age > 65:
    print("Special discount")

# NOT — flip the result
logged_in = False
if not logged_in:
    print("Please log in first")
int age = 20;
int score = 85;
bool loggedIn = false;

// AND
if (age >= 18 && score >= 80)
    cout << "Eligible and high score" << endl;

// OR
if (age < 13 || age > 65)
    cout << "Special discount" << endl;

// NOT
if (!loggedIn)
    cout << "Please log in first" << endl;
| A | B | A and B | A or B | not A |
|---|---|---|---|---|
| True | True | True | True | False |
| True | False | False | True | False |
| False | True | False | True | True |
| False | False | False | False | True |
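Combined conditions can guard each other: if the first part of an and is False, the second part is never evaluated. A minimal sketch where that matters, because the second condition would divide by zero:

```python
x = 0

# Without the first check, 10 / x would raise ZeroDivisionError.
# Because x != 0 is False, Python never evaluates 10 / x.
if x != 0 and 10 / x > 1:
    print("big ratio")
else:
    print("safe: second check was skipped")
```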
Both languages short-circuit: with and, if the first condition is False, Python/C++ skip the second: it can't be true anyway. With or, if the first is True, they skip the second. This matters for performance and avoiding errors.
Quick check: what is True and False?
Loop Control: break & continue
Sometimes you need to stop a loop early, or skip certain iterations. break exits a loop immediately. continue skips the current iteration. Both behave identically in Python and C++.
# Stop as soon as we find 5
for i in range(1, 10):
    if i == 5:
        print("Found 5, stopping!")
        break
    print(i)

# Output: 1 2 3 4 Found 5, stopping!
// Stop as soon as we find 5
for (int i = 1; i < 10; i++) {
    if (i == 5) {
        cout << "Found 5, stopping!" << endl;
        break;
    }
    cout << i << endl;
}
// Output: 1 2 3 4 Found 5, stopping!
# Print only odd numbers (skip evens)
for i in range(1, 11):
    if i % 2 == 0:  # if even
        continue
    print(i)

# Output: 1 3 5 7 9
// Print only odd numbers
for (int i = 1; i <= 10; i++) {
    if (i % 2 == 0)
        continue;
    cout << i << endl;
}
// Output: 1 3 5 7 9
names = ["Alice", "Bob", "Clara", "Dave"]
target = "Clara"

for name in names:
    if name == target:
        print(f"Found {name}!")
        break
    print(f"Checked {name}...")
string names[] = {"Alice","Bob","Clara","Dave"};
string target = "Clara";

for (string name : names) {
    if (name == target) {
        cout << "Found " << name << "!" << endl;
        break;
    }
    cout << "Checked " << name << endl;
}
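break also pairs naturally with while True for "loop until found" searches, when you don't know in advance how many iterations you need. A sketch finding the first square number over 500:

```python
n = 1
while True:                 # deliberately endless...
    if n * n > 500:
        print(f"First square over 500: {n * n} ({n} squared)")
        break               # ...until break ends it
    n += 1
```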
Quick check: what does continue do inside a loop?
Error Handling
Errors are inevitable. Error handling lets a program respond to them at runtime rather than crash.
num = int("hello")  # CRASH!
# ValueError: invalid literal for int()

result = 10 / 0     # CRASH!
# ZeroDivisionError: division by zero
int result = 10 / 0;    // Undefined behaviour
int x = stoi("hello");  // throws exception
# Basic try / except
try:
    num = int("hello")
    print(num)
except ValueError:
    print("That's not a valid number!")

# Catching division by zero
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero.")

# Catch any error
try:
    x = int(input("Enter a number: "))
    print(100 / x)
except Exception as e:
    print(f"Error: {e}")
#include <stdexcept>

// Basic try / catch
try {
    int num = stoi("hello");  // throws
    cout << num << endl;
} catch (invalid_argument& e) {
    cout << "Not a valid number!" << endl;
}

// Catch any exception
try {
    int x = 0;
    if (x == 0) throw runtime_error("div by zero");
    cout << 10/x << endl;
} catch (exception& e) {
    cout << "Error: " << e.what() << endl;
}
A finally block runs no matter what: whether an error happened or not. Use it for cleanup tasks like closing files or showing a "done" message.

try:
    num = int("42")
    print(f"Got: {num}")
except ValueError:
    print("Invalid number")
finally:
    print("This always runs")

# Output:
# Got: 42
# This always runs
// C++ has no "finally" keyword
// but you can simulate it:
try {
    int num = stoi("42");
    cout << "Got: " << num << endl;
} catch (...) {
    cout << "Invalid number" << endl;
}
// Code after try/catch always runs
cout << "This always runs" << endl;
Variable Scope
Scope defines where a variable is accessible. A variable declared inside a function does not exist outside it.
name = "Alice"  # GLOBAL — visible everywhere

def greet():
    message = "Hello!"  # LOCAL — only inside greet()
    print(name)     # Can access global
    print(message)  # Can access local

greet()
print(name)     # Works — global
print(message)  # ERROR — message doesn't exist here
string name = "Alice";  // GLOBAL

void greet() {
    string message = "Hello!";  // LOCAL
    cout << name << endl;     // Can access global
    cout << message << endl;  // Can access local
}

int main() {
    greet();
    cout << name << endl;  // Works
    // cout << message — ERROR
    return 0;
}
To change a global variable inside a Python function, you must declare it with the global keyword. In C++, global variables can always be read and modified.

count = 0

def increment():
    global count  # must declare this
    count += 1

increment()
increment()
increment()
print(count)  # 3
int count = 0;  // global

void increment() {
    count++;  // no keyword needed in C++
}

int main() {
    increment();
    increment();
    increment();
    cout << count << endl;  // 3
    return 0;
}
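One subtlety worth seeing: assigning to a name inside a Python function (without global) creates a new local variable that shadows the global one. A minimal sketch:

```python
x = "global"

def demo():
    x = "local"  # creates a NEW local x; the global x is untouched
    print(x)     # local

demo()
print(x)         # global — unchanged
```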
Classes & Objects
A class is a blueprint. An object is an instance of that blueprint. This is the foundation of object-oriented programming.
Here is the same Person class in both languages.

class Person:
    def __init__(self, name, age):
        self.name = name  # attribute
        self.age = age

    def greet(self):  # method
        print(f"Hi, I'm {self.name}, {self.age}.")

    def birthday(self):
        self.age += 1
        print(f"Happy birthday {self.name}! Now {self.age}.")

# Create objects from the class
alice = Person("Alice", 22)
bob = Person("Bob", 30)

alice.greet()
bob.greet()
alice.birthday()
print(alice.age)  # 23
class Person {
public:
    string name;
    int age;

    // Constructor
    Person(string n, int a) {
        name = n;
        age = a;
    }

    void greet() {
        cout << "Hi, I'm " << name << ", " << age << "." << endl;
    }

    void birthday() {
        age++;
        cout << "Happy birthday " << name << "! Now " << age << endl;
    }
};

int main() {
    Person alice("Alice", 22);
    Person bob("Bob", 30);
    alice.greet();
    bob.greet();
    alice.birthday();
    return 0;
}
| Term | Meaning | Example |
|---|---|---|
| Class | The blueprint / template | class Person: |
| Object / Instance | A specific thing built from the class | alice = Person("Alice", 22) |
| Attribute | Data stored in the object | self.name, self.age |
| Method | A function that belongs to the class | def greet(self): |
| Constructor | Special method that runs when object is created | __init__ (Python) / ClassName() (C++) |
| self / this | Refers to the current object | self.name (Python), this->name (C++) |
class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        print(f"Deposited {amount}. Balance: {self.balance}")

    def withdraw(self, amount):
        if amount > self.balance:
            print("Insufficient funds.")
        else:
            self.balance -= amount
            print(f"Withdrew {amount}. Balance: {self.balance}")

acc = BankAccount("Alice", 100)
acc.deposit(50)
acc.withdraw(30)
acc.withdraw(200)
class BankAccount {
public:
    string owner;
    double balance;

    BankAccount(string o, double b) {
        owner = o;
        balance = b;
    }

    void deposit(double amt) {
        balance += amt;
        cout << "Balance: " << balance << endl;
    }

    void withdraw(double amt) {
        if (amt > balance)
            cout << "Insufficient funds." << endl;
        else {
            balance -= amt;
            cout << "Balance: " << balance << endl;
        }
    }
};
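Objects combine naturally with lists: you can store several of them and loop over their attributes like any other values. A sketch using a stripped-down Person class (redefined here so the example stands alone):

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

people = [Person("Alice", 22), Person("Bob", 30), Person("Clara", 27)]

# Loop over objects like any other list
for p in people:
    print(f"{p.name} is {p.age}")

# Find the oldest by comparing an attribute
oldest = people[0]
for p in people:
    if p.age > oldest.age:
        oldest = p
print(f"Oldest: {oldest.name}")  # Oldest: Bob
```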
Math & Libraries
Libraries are collections of pre-written code you import and use. The standard math library covers roots, rounding, and trigonometry.
In Python you load the maths library with import math. In C++ you include it with #include <cmath>.

import math

print(math.sqrt(16))    # 4.0
print(math.sqrt(2))     # 1.4142...
print(math.floor(3.7))  # 3 (round down)
print(math.ceil(3.2))   # 4 (round up)
print(round(3.567, 2))  # 3.57
print(abs(-15))         # 15
print(math.pow(2, 8))   # 256.0
print(math.pi)          # 3.14159...
#include <cmath>

cout << sqrt(16) << endl;      // 4
cout << sqrt(2) << endl;       // 1.4142
cout << floor(3.7) << endl;    // 3
cout << ceil(3.2) << endl;     // 4
cout << round(3.567) << endl;  // 4
cout << abs(-15) << endl;      // 15
cout << pow(2, 8) << endl;     // 256
cout << M_PI << endl;          // 3.14159
import random

# Random integer between 1 and 10
n = random.randint(1, 10)
print(n)

# Random float between 0 and 1
f = random.random()
print(f)

# Pick a random item from a list
colours = ["red", "blue", "green"]
print(random.choice(colours))

# Shuffle a list
random.shuffle(colours)
print(colours)
#include <cstdlib>
#include <ctime>

// Seed the random generator
srand(time(0));

// Random int 1–10
int n = rand() % 10 + 1;
cout << n << endl;

// Random float 0–1
double f = (double)rand() / RAND_MAX;
cout << f << endl;
So far we have only imported the math module and used math.sqrt(). There are libraries for web requests, databases, plotting, machine learning, image processing. Using them is a core part of practical programming.
| Python Library | What it does |
|---|---|
| math | Square roots, trigonometry, logarithms |
| random | Random numbers, shuffling, picking |
| datetime | Dates, times, time differences |
| os | File system operations |
| json | Read and write JSON data |
| requests | Make HTTP requests to the internet |
| numpy | Fast maths on large arrays (scientific) |
| pandas | Work with tables of data |
Python for Mechanical & Aerospace Engineers
Python covers the same ground as MATLAB and Excel: with better automation and no licence cost. This module covers the specific tools aerospace and mechanical engineers use in practice.
| Typical engineering task | Excel / MATLAB | Python |
|---|---|---|
| Process 10,000 rows of sensor data | Excel slows down, crashes | Loads in under a second with pandas |
| Plot a stress-strain curve with annotations | Excel chart: limited control | Full control with matplotlib in 15 lines |
| Fit a regression line to test data | Excel trendline: no equations output | numpy.polyfit gives slope, intercept, R² |
| Solve a system of equations (stiffness matrix) | MATLAB: costs thousands per licence | numpy.linalg.solve: free, same syntax |
| Run the same analysis on 50 test files | Open each one manually | Loop through all files in 10 lines |
| Share your analysis method | Send the file and hope formatting survives | Send the script: identical result every time |
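The regression row above can be sketched with numpy.polyfit. The data here is invented for illustration (a hypothetical load-extension test that is almost perfectly linear); R² is computed from the residuals since polyfit itself doesn't return it:

```python
import numpy as np

# Hypothetical test data: load (kN) vs extension (mm)
load = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ext = np.array([0.02, 1.05, 1.98, 3.01, 4.02])

slope, intercept = np.polyfit(load, ext, 1)  # degree-1 = straight-line fit
predicted = slope * load + intercept

# R-squared from the residuals
ss_res = np.sum((ext - predicted) ** 2)
ss_tot = np.sum((ext - ext.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.4f}")
```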
| Library | What it does | Engineering use case |
|---|---|---|
| numpy | Fast maths on arrays: vectors, matrices, trig, linear algebra | Stress calculations, coordinate transforms, solving Ax = b, FEA pre/post |
| pandas | Load, inspect, filter, and transform tables of data | Sensor logs, fatigue test results, wind tunnel data, material databases |
| matplotlib | Plot any graph: line, scatter, bar, contour, polar | Stress-strain curves, lift polars, trajectory plots, temperature maps |
| scipy | Scientific algorithms: signal processing, curve fitting, integration, ODE solving | Vibration analysis, aerodynamic data fitting, flight path integration |
Install everything with one command: pip install numpy pandas matplotlib scipy openpyxl. That's your complete engineering toolkit: the same capabilities as a MATLAB installation, without the licence.

A complete worked example, start to finish:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# 1. Load test data from the testing machine CSV export
df = pd.read_csv("al6061_tensile.csv")   # columns: strain_pct, stress_MPa

# 2. Build the 0.2% offset line for the proof stress (E = 68.9 GPa)
offset = 68900 * (df["strain_pct"] - 0.2) / 100   # stress in MPa

# 3. Compute basic stats
uts = df["stress_MPa"].max()
print(f"UTS: {uts:.1f} MPa")
print(f"Fracture strain: {df['strain_pct'].iloc[-1]:.2f}%")

# 4. Plot the stress-strain curve with the offset line
plt.figure(figsize=(9, 6))
plt.plot(df["strain_pct"], df["stress_MPa"], color="steelblue",
         linewidth=2, label="Al 6061-T6")
plt.plot(df["strain_pct"], offset, "k--", lw=1, label="0.2% offset line")
plt.axhline(uts, color="red", linestyle="--", label=f"UTS = {uts:.0f} MPa")
plt.xlabel("Strain (%)"); plt.ylabel("Stress (MPa)")
plt.title("Tensile Test — Al 6061-T6")
plt.legend(); plt.grid(True, alpha=0.3)
plt.savefig("stress_strain.png", dpi=300)
plt.show()
```
NumPy: Engineering Maths at Scale
NumPy is the numerical foundation of scientific Python. It operates on entire arrays at once: the same vectorised approach used in MATLAB, without the licence.
```python
import numpy as np

# Create arrays from data
stress = np.array([0, 50, 100, 150, 200, 250])        # MPa
strain = np.array([0, 0.07, 0.14, 0.22, 0.29, 0.36])  # %

# Maths on every element at once — no loop
stress_psi = stress * 145.038   # convert MPa → psi (all 6 values at once)
strain_dec = strain / 100       # percent → decimal

# Generate engineering sequences
angles = np.linspace(0, 360, 361)     # 0° to 360°, every 1°
time = np.linspace(0, 10, 1000)       # 0 to 10 seconds, 1000 points
thicknesses = np.arange(1, 20, 0.5)   # 1mm to 19.5mm, step 0.5mm

# Array properties
print(stress.shape)   # (6,) — 1D array with 6 elements
print(stress.size)    # 6
print(stress.dtype)   # int64
```
```python
import numpy as np

# Trig — angles in radians (convert first)
aoa_deg = np.array([0, 2, 4, 6, 8, 10, 12])   # angle of attack, degrees
aoa_rad = np.radians(aoa_deg)

# Thin aerofoil theory: CL ≈ 2π·sin(α) (linearised)
CL = 2 * np.pi * np.sin(aoa_rad)
print("CL values:", np.round(CL, 3))

# Decompose a velocity vector into components
V_total = 250   # m/s (TAS)
gamma = 15      # flight path angle, degrees
Vx = V_total * np.cos(np.radians(gamma))   # horizontal component
Vz = V_total * np.sin(np.radians(gamma))   # vertical component (climb)
print(f"Vx = {Vx:.1f} m/s, Vz = {Vz:.1f} m/s")

# 2D rotation matrix — rotate a force vector by θ
theta = np.radians(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
F = np.array([1000, 0])   # 1000 N in x direction
F_rotated = R @ F         # @ is matrix multiply
print(f"Rotated force: {F_rotated.round(1)} N")
```
Solving a linear system is one line, the direct equivalent of MATLAB's x = A\b.

```python
import numpy as np

# Example: 3-bar truss — solve for nodal displacements
# K·u = F → stiffness matrix × displacements = forces
K = np.array([[ 3, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  1]], dtype=float)   # stiffness matrix
F = np.array([0, 10000, 0])                 # applied forces, N

# Solve — equivalent to MATLAB's u = K\F
u = np.linalg.solve(K, F)
print("Displacements (mm):", u.round(4))

# Matrix properties engineers need
print(f"Determinant: {np.linalg.det(K):.2f}")
print(f"Condition no.: {np.linalg.cond(K):.2f}")   # ill-conditioned if large

eigenvalues, eigenvectors = np.linalg.eig(K)
print("Natural frequencies (proportional):", np.sqrt(eigenvalues).round(3))
```
```python
import numpy as np

# ISA constants
T0 = 288.15    # sea level temperature, K
P0 = 101325    # sea level pressure, Pa
rho0 = 1.225   # sea level density, kg/m³
L = 0.0065     # lapse rate, K/m
g = 9.80665    # gravitational acceleration, m/s²
R = 287.05     # specific gas constant, J/(kg·K)

# Altitude range: sea level to 11 km (troposphere), 100 points
h = np.linspace(0, 11000, 100)   # metres

# ISA equations — applied to the entire array at once
T = T0 - L * h                       # temperature, K
P = P0 * (T / T0) ** (g / (L * R))   # pressure, Pa
rho = P / (R * T)                    # density, kg/m³
a = np.sqrt(1.4 * R * T)             # speed of sound, m/s

# Print at specific altitudes
for alt in [0, 3000, 6000, 9000, 11000]:
    idx = np.argmin(np.abs(h - alt))   # find nearest index
    print(f"h={alt:5d}m  T={T[idx]:.1f}K  rho={rho[idx]:.4f}kg/m³  a={a[idx]:.1f}m/s")
```
Try it yourself: you have stresses in MPa, s = np.array([100, 200, 300]). How do you convert them all to psi (1 MPa = 145.038 psi) in one line?

Statistics for Engineers
Test data always has scatter. This lesson covers the statistical tools for characterising distributions, fitting regression lines, and quantifying uncertainty.
```python
import numpy as np

# UTS results from 20 tensile specimens of Ti-6Al-4V (MPa)
uts = np.array([940, 952, 938, 961, 945, 958, 933, 970, 948, 955,
                942, 964, 950, 937, 956, 947, 962, 943, 959, 951])

mean = np.mean(uts)
std = np.std(uts, ddof=1)      # ddof=1 for sample std dev
cv = (std / mean) * 100        # coefficient of variation, %
se = std / np.sqrt(len(uts))   # standard error

print(f"Mean UTS:           {mean:.1f} MPa")
print(f"Std deviation:      {std:.1f} MPa")
print(f"Coefficient of var: {cv:.2f}%")
print(f"Min / Max:          {uts.min()} / {uts.max()} MPa")
print(f"95% CI (approx):    {mean:.1f} ± {2*se:.1f} MPa")

# Spot outliers: values beyond 2 standard deviations
outliers = uts[np.abs(uts - mean) > 2 * std]
print(f"Outliers: {outliers}")
```
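To see exactly what ddof changes, a minimal sketch comparing the two formulas on a tiny made-up sample:

```python
import numpy as np

# Three measurements — a sample, not the whole population
sample = np.array([10.0, 12.0, 14.0])

pop_std = np.std(sample, ddof=0)    # divides by n     (population formula)
samp_std = np.std(sample, ddof=1)   # divides by n - 1 (sample formula)

print(round(float(pop_std), 3))     # 1.633
print(round(float(samp_std), 3))    # 2.0
```

With only three points the difference is large; it shrinks as n grows, but for small test batches it matters.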
Use ddof=1 for the sample standard deviation (when your data is a sample from a larger population, which it almost always is in testing). Use ddof=0 only when you have the entire population. Getting this wrong is a common mistake in engineering reports.

Fit regression lines with polyfit; R² then takes three lines to compute and quantifies how good the fit is.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fatigue data: stress amplitude (MPa) vs log10(cycles to failure)
stress = np.array([400, 350, 300, 280, 260, 240, 220, 200])
log_N = np.array([3.8, 4.2, 4.9, 5.3, 5.7, 6.1, 6.6, 7.0])

# Linear regression: fit a straight line y = m*x + c
coeffs = np.polyfit(log_N, stress, 1)   # degree 1 = linear
m, c = coeffs
print(f"Slope (m): {m:.2f}")
print(f"Intercept (c): {c:.2f}")
print(f"Equation: stress = {m:.1f} * log10(N) + {c:.1f}")

# Calculate R² (coefficient of determination)
predicted = np.polyval(coeffs, log_N)
ss_res = np.sum((stress - predicted) ** 2)
ss_tot = np.sum((stress - stress.mean()) ** 2)
r_squared = 1 - (ss_res / ss_tot)
print(f"R² = {r_squared:.4f}")   # closer to 1.0 = better fit

# Predict: what stress gives 10^6 cycles?
stress_at_1M = np.polyval(coeffs, 6.0)
print(f"Fatigue limit (10^6 cycles): {stress_at_1M:.1f} MPa")

# Plot
fit_line = np.linspace(3.5, 7.5, 100)
plt.scatter(log_N, stress, color="steelblue", s=60, label="Test data", zorder=5)
plt.plot(fit_line, np.polyval(coeffs, fit_line), "r--",
         label=f"Linear fit R²={r_squared:.3f}")
plt.xlabel("log₁₀(N)"); plt.ylabel("Stress amplitude (MPa)")
plt.title("S-N Curve — Al 2024-T3")
plt.legend(); plt.grid(True, alpha=0.3)
plt.show()
```
Not everything is a straight line; a higher-degree polyfit handles curves too.

```python
import numpy as np
import matplotlib.pyplot as plt

# Experimental lift coefficient data (clean wing, subsonic)
alpha = np.array([-4, -2, 0, 2, 4, 6, 8, 10, 12])   # AoA, deg
CL = np.array([-0.28, -0.06, 0.18, 0.42, 0.65, 0.88, 1.05, 1.18, 1.22])

# Fit a 2nd-degree polynomial (captures the slight nonlinearity)
coeffs = np.polyfit(alpha, CL, 2)
print(f"CL = {coeffs[0]:.5f}·α² + {coeffs[1]:.4f}·α + {coeffs[2]:.4f}")

# Lift slope dCL/dα at α=0° — derivative 2aα + b evaluated at zero
lift_slope = coeffs[1]
print(f"Lift slope: {lift_slope:.4f} /deg ({lift_slope * 57.3:.3f} /rad)")

# Zero-lift angle: take the root of the polynomial nearest zero (the physical one)
roots = np.roots(coeffs)
zero_lift_alpha = roots[np.argmin(np.abs(roots))]
print(f"Zero-lift angle: {zero_lift_alpha:.2f}°")

# Plot the fit against the data
alpha_smooth = np.linspace(-5, 14, 200)
CL_fit = np.polyval(coeffs, alpha_smooth)

plt.scatter(alpha, CL, color="steelblue", s=60, zorder=5, label="Wind tunnel data")
plt.plot(alpha_smooth, CL_fit, "r-", linewidth=2, label="Quadratic fit")
plt.axhline(0, color="gray", lw=0.5)
plt.axvline(0, color="gray", lw=0.5)
plt.xlabel("Angle of Attack α (°)")
plt.ylabel("Lift Coefficient CL")
plt.title("Lift Polar — Clean Wing")
plt.legend(); plt.grid(True, alpha=0.3)
plt.show()
```
Pandas: Analysing Real Test Data
Pandas provides the DataFrame, a table structure that reads directly from CSV and Excel exports. You can filter, transform, aggregate, and export, all in code that runs identically every time.
```python
import pandas as pd

# Load CSV exported from test rig or DAS (data acquisition system)
df = pd.read_csv("wing_fatigue_tests.csv")

# Or load from Excel directly
df = pd.read_excel("material_test_results.xlsx", sheet_name="Fatigue")

# First look — always do this before any analysis
print(df.head(5))            # first 5 rows
print(df.shape)              # (rows, columns)
print(df.columns.tolist())   # column names
print(df.dtypes)             # data types — check nothing read as text
print(df.isnull().sum())     # missing values per column
print(df.describe())         # count, mean, std, min, 25%, 50%, 75%, max
```
```python
# Select a single column
cycles = df["cycles_to_failure"]

# Filter rows: only specimens that failed (not run-outs)
failed = df[df["status"] == "FAILURE"]
runouts = df[df["status"] == "RUNOUT"]
print(f"Failures: {len(failed)}, Run-outs: {len(runouts)}")

# Filter: specific material AND stress range
al_high = df[
    (df["material"] == "Al2024-T3") &
    (df["stress_MPa"] >= 200) &
    (df["stress_MPa"] <= 350)
]

# Filter to several materials at once
metals = df[df["material"].isin(["Al2024-T3", "Ti6Al4V", "Steel4340"])]

# Sort by cycles descending
df_sorted = df.sort_values("cycles_to_failure", ascending=False)
```
```python
# Compute stress from raw load and cross-section area
df["stress_MPa"] = df["load_N"] / df["area_mm2"]

# Compute engineering strain from displacement and gauge length
df["strain_pct"] = (df["disp_mm"] / df["gauge_mm"]) * 100

# Flag specimens that exceeded design limit
df["overload"] = df["stress_MPa"] > 250

# Mean stress and stress ratio for fatigue
df["stress_mean"] = (df["stress_max"] + df["stress_min"]) / 2
df["stress_ratio"] = df["stress_min"] / df["stress_max"]

# Group by material — get mean, std, count per group (like a pivot table)
summary = df.groupby("material").agg(
    mean_UTS    = ("stress_MPa", "mean"),
    std_UTS     = ("stress_MPa", "std"),
    n_specimens = ("stress_MPa", "count")
).round(2)
print(summary)
```
```python
import pandas as pd
import numpy as np

# 1. Load raw test data
df = pd.read_csv("fatigue_database.csv")
print(f"Loaded {len(df)} records.")

# 2. Clean — remove rows with missing critical values
df = df.dropna(subset=["stress_MPa", "cycles_to_failure"])

# 3. Compute log cycles (for S-N plot)
df["log_N"] = np.log10(df["cycles_to_failure"])

# 4. Filter to material of interest
mat = df[df["material"] == "Al2024-T3"]

# 5. Linear regression on S-N data
coeffs = np.polyfit(mat["log_N"], mat["stress_MPa"], 1)
print(f"Fatigue limit (10^7): {np.polyval(coeffs, 7):.1f} MPa")

# 6. Export processed results
mat.to_csv("al2024_processed.csv", index=False)
print("Saved al2024_processed.csv")
```
Matplotlib: Engineering Plots
Matplotlib produces publication-quality plots with full control over every element: axes, labels, annotations, line styles. No menu-clicking required.
```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated tensile test data — Al 6061-T6
strain = np.array([0, 0.1, 0.2, 0.3, 0.45, 0.7, 1.0, 1.5, 2.0, 3.0, 4.5, 6.0, 8.0])
stress = np.array([0, 69, 138, 207, 270, 290, 300, 310, 318, 325, 310, 285, 240])

uts_idx    = np.argmax(stress)
uts_val    = stress[uts_idx]
uts_strain = strain[uts_idx]

fig, ax = plt.subplots(figsize=(9, 6))

# Main curve
ax.plot(strain, stress, color="#2563EB", linewidth=2.5, label="Al 6061-T6")

# Elastic modulus line (E = 68.9 GPa, drawn up to 0.3% strain)
ax.plot([0, 0.3], [0, 68.9 * 0.003 * 1000], "k--", lw=1, label="Elastic region")

# Annotate UTS
ax.annotate(f"UTS = {uts_val} MPa",
            xy=(uts_strain, uts_val),
            xytext=(uts_strain - 1.5, uts_val - 25),
            arrowprops=dict(arrowstyle="->", color="red"),
            color="red", fontsize=10)

ax.set_xlabel("Engineering Strain (%)", fontsize=12)
ax.set_ylabel("Engineering Stress (MPa)", fontsize=12)
ax.set_title("Tensile Test — Al 6061-T6", fontsize=14)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("stress_strain.png", dpi=300)
plt.show()
```
```python
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 120, 1200)                 # 0–120 s at 10 Hz
alt = 1000 * (t / 120) ** 0.5                 # altitude climb
spd = 60 + 80 * (t / 120)                     # speed ramp
aoa = 8 * np.exp(-t / 30) + 2                 # AoA reducing after rotation
nz  = 1 + 0.3 * np.sin(2 * np.pi * t / 15)    # normal load factor

fig, axes = plt.subplots(4, 1, figsize=(11, 10), sharex=True)
fig.suptitle("Flight Test Data — Climb Segment", fontsize=14, y=0.98)

data   = [alt, spd, aoa, nz]
labels = ["Altitude (m)", "Airspeed (m/s)", "AoA (°)", "Load Factor Nz"]
colors = ["#2563EB", "#16A34A", "#DC2626", "#9333EA"]

for ax, d, lbl, col in zip(axes, data, labels, colors):
    ax.plot(t, d, color=col, linewidth=1.5)
    ax.set_ylabel(lbl, fontsize=10)
    ax.grid(True, alpha=0.25)

axes[3].axhline(2.5, color="red", ls="--", lw=1, label="Limit load")
axes[3].legend(fontsize=9)
axes[3].set_xlabel("Time (s)", fontsize=11)

plt.tight_layout()
plt.savefig("flight_data.png", dpi=300)
plt.show()
```
```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated dataset: fatigue tests at different temperatures
np.random.seed(42)
n = 80
stress = np.random.uniform(150, 400, n)
log_N = 8.5 - 0.012 * stress + np.random.normal(0, 0.2, n)
temp = np.random.uniform(20, 300, n)   # temperature, °C

fig, ax = plt.subplots(figsize=(9, 6))
sc = ax.scatter(log_N, stress, c=temp, cmap="plasma", s=50, alpha=0.8)
cbar = plt.colorbar(sc, ax=ax)
cbar.set_label("Temperature (°C)", fontsize=11)

ax.set_xlabel("log₁₀(N) — Cycles to Failure", fontsize=12)
ax.set_ylabel("Stress Amplitude (MPa)", fontsize=12)
ax.set_title("S-N Data Coloured by Test Temperature", fontsize=13)
ax.grid(True, alpha=0.25)
plt.tight_layout()
plt.savefig("sn_temperature.png", dpi=300)
plt.show()
```
Automating Excel Reports
Excel remains the standard for sharing results in most engineering organisations. Python automates the process: read the data, run the analysis, write the report. The same script handles every new dataset.
```python
import pandas as pd

# Read first sheet (default)
df = pd.read_excel("test_results.xlsx")

# Read a named sheet
df = pd.read_excel("test_results.xlsx", sheet_name="Fatigue")

# Read all sheets at once (returns a dict)
all_sheets = pd.read_excel("test_results.xlsx", sheet_name=None)
for name, sheet_df in all_sheets.items():
    print(f"Sheet '{name}': {len(sheet_df)} rows")

# Skip metadata rows at the top (common in lab exports)
df = pd.read_excel("test_results.xlsx", skiprows=4, header=0)

# Read only specific columns to save memory
df = pd.read_excel("test_results.xlsx",
                   usecols=["specimen_id", "material", "UTS_MPa", "cycles"])
```
```python
import pandas as pd
import numpy as np

# Your analysed data
raw_df = pd.read_csv("fatigue_tests.csv")
raw_df["log_N"] = np.log10(raw_df["cycles_to_failure"])

# Summary statistics per material
summary = raw_df.groupby("material").agg(
    n_specimens = ("UTS_MPa", "count"),
    mean_UTS    = ("UTS_MPa", "mean"),
    std_UTS     = ("UTS_MPa", "std"),
    min_cycles  = ("cycles_to_failure", "min"),
    max_cycles  = ("cycles_to_failure", "max")
).round(2)

# Write to Excel with multiple sheets
with pd.ExcelWriter("fatigue_report.xlsx", engine="openpyxl") as writer:
    raw_df.to_excel(writer, sheet_name="Raw Data", index=False)
    summary.to_excel(writer, sheet_name="Summary", index=True)

print("Saved fatigue_report.xlsx with 2 sheets.")
```
```python
import pandas as pd
import numpy as np
import glob
import os

# Find every .xlsx file in the test_data folder
files = glob.glob("test_data/*.xlsx")
print(f"Found {len(files)} test files.")

results = []
for filepath in files:
    specimen_id = os.path.basename(filepath).replace(".xlsx", "")

    # Load each file
    df = pd.read_excel(filepath)

    # Compute stress from raw columns
    df["stress_MPa"] = df["load_N"] / df["area_mm2"]

    # Extract summary for this specimen
    # Secant modulus from row 5: E [GPa] = stress [MPa] / (strain [%] / 100) / 1000
    row = {
        "specimen":  specimen_id,
        "material":  df["material"].iloc[0],
        "UTS_MPa":   df["stress_MPa"].max().round(1),
        "E_GPa":     (df["stress_MPa"].iloc[5] / df["strain_pct"].iloc[5] / 10).round(1),
        "elong_pct": df["strain_pct"].iloc[-1]
    }
    results.append(row)
    print(f"  Processed {specimen_id} → UTS = {row['UTS_MPa']} MPa")

# Compile all results into one report
report = pd.DataFrame(results)
print("\n=== BATCH SUMMARY ===")
print(report)

report.to_excel("batch_summary.xlsx", index=False)
print("\nSaved batch_summary.xlsx")
```
| Engineering task | Excel | Python |
|---|---|---|
| Open a file | Double-click | pd.read_csv() / pd.read_excel() |
| Summary stats (all columns) | =AVERAGE, =STDEV, etc. | df.describe() |
| Filter rows by condition | Data › AutoFilter | df[df["col"] > value] |
| Add computed column | Write formula, drag down | df["new"] = df["a"] / df["b"] |
| Group statistics (pivot table) | Insert › PivotTable | df.groupby("mat")["val"].mean() |
| Sort data | Data › Sort | df.sort_values("col") |
| Remove duplicates | Data › Remove Duplicates | df.drop_duplicates() |
| Count missing values | Manually scan | df.isnull().sum() |
| Linear regression | Add trendline to chart | np.polyfit(x, y, 1) |
| Solve Ax = b | MATLAB / manual | np.linalg.solve(A, b) |
| Plot stress-strain curve | Insert Chart (8 clicks) | plt.plot(strain, stress) |
| Save plot for report | Right-click › Save as image | plt.savefig("plot.png", dpi=300) |
| Process 50 test files | Open each one manually | for f in glob("*.xlsx"): ... |
SciPy: The Engineering Toolkit
SciPy provides the scientific algorithms that sit on top of NumPy: FFT, curve fitting, ODE solvers, and signal filtering. These are the tools that make Python genuinely useful for engineering work.
```python
import numpy as np
from scipy import fft
import matplotlib.pyplot as plt

# Simulate accelerometer data: 50 Hz + 120 Hz signal + noise
fs = 1000                                   # sample rate, Hz
t = np.linspace(0, 1, fs, endpoint=False)   # 1 second
sig = (np.sin(2*np.pi*50*t)                 # 50 Hz component
       + 0.5*np.sin(2*np.pi*120*t)          # 120 Hz component
       + 0.3*np.random.randn(fs))           # noise

# Compute FFT
N = len(sig)
yf = fft.fft(sig)
xf = fft.fftfreq(N, 1/fs)         # frequency axis
power = 2/N * np.abs(yf[:N//2])   # one-sided power spectrum
freqs = xf[:N//2]

# Find dominant frequency
peak_freq = freqs[np.argmax(power)]
print(f"Dominant frequency: {peak_freq:.1f} Hz")

# Plot spectrum
plt.figure(figsize=(10, 4))
plt.plot(freqs, power, color="steelblue", lw=1.2)
plt.xlabel("Frequency (Hz)"); plt.ylabel("Amplitude")
plt.title("FFT Power Spectrum"); plt.grid(True, alpha=0.3)
plt.xlim(0, 300)
plt.show()
```
scipy.optimize.curve_fit is more powerful than np.polyfit: you define the equation shape yourself. Exponential decay, power law, sine wave, anything. It returns the best-fit parameters and their uncertainties.

```python
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Define the equation to fit — power law: y = a * x^b
def power_law(x, a, b):
    return a * x ** b

# Creep strain data: time (hours) vs strain (%)
time = np.array([1, 5, 10, 50, 100, 500, 1000])
strain = np.array([0.12, 0.21, 0.28, 0.48, 0.61, 1.05, 1.32])

# Fit — popt = best parameters, pcov = covariance (uncertainty)
popt, pcov = curve_fit(power_law, time, strain)
a, b = popt
perr = np.sqrt(np.diag(pcov))   # 1-sigma uncertainty
print(f"a = {a:.4f} ± {perr[0]:.4f}")
print(f"b = {b:.4f} ± {perr[1]:.4f}")
print(f"Equation: strain = {a:.4f} * t^{b:.4f}")

# Predict strain at 2000 hours
print(f"Predicted strain at 2000h: {power_law(2000, *popt):.3f}%")

# Plot
t_smooth = np.logspace(0, 4, 200)
plt.scatter(time, strain, s=60, zorder=5, label="Data")
plt.plot(t_smooth, power_law(t_smooth, *popt), "r-",
         label=f"Fit: {a:.3f}·t^{b:.3f}")
plt.xscale("log"); plt.yscale("log")
plt.xlabel("Time (h)"); plt.ylabel("Strain (%)")
plt.title("Creep Curve — Power Law Fit")
plt.legend(); plt.grid(True, alpha=0.3)
plt.show()
```
Differential equations appear throughout engineering; scipy.integrate.solve_ivp is the modern way to solve them numerically.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

# Constants
g0   = 9.80665   # m/s²
m0   = 10000     # initial mass, kg
mdot = 50        # mass flow rate, kg/s
Isp  = 300       # specific impulse, s
ve   = Isp * g0  # exhaust velocity, m/s
Cd   = 0.3       # drag coefficient
A    = 1.0       # reference area, m²
rho0 = 1.225     # sea level density, kg/m³

def rocket_eom(t, y):
    """State vector: y = [altitude, velocity, mass]"""
    h, v, m = y
    rho = rho0 * np.exp(-h / 8500)                  # exponential atmosphere
    drag = 0.5 * rho * v**2 * Cd * A
    thrust = mdot * ve if m > m0 - mdot*60 else 0   # burn for 60 s
    dhdt = v
    dvdt = (thrust - drag) / m - g0
    dmdt = -mdot if thrust > 0 else 0
    return [dhdt, dvdt, dmdt]

# Solve from t=0 to t=120 s
sol = solve_ivp(rocket_eom, [0, 120], [0, 0, m0], max_step=0.5, dense_output=True)
t = sol.t
h = sol.y[0] / 1000   # altitude, km
v = sol.y[1]          # velocity, m/s

print(f"Max altitude: {h.max():.2f} km")
print(f"Max velocity: {v.max():.1f} m/s")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4))
ax1.plot(t, h, color="#2563EB"); ax1.set_ylabel("Altitude (km)")
ax2.plot(t, v, color="#DC2626"); ax2.set_ylabel("Velocity (m/s)")
for ax in (ax1, ax2):
    ax.set_xlabel("Time (s)"); ax.grid(True, alpha=0.3)
plt.suptitle("Rocket Trajectory"); plt.tight_layout()
plt.show()
```
```python
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs = 500                      # sample rate, Hz
t = np.linspace(0, 2, 2*fs)   # 2 seconds
clean = np.sin(2*np.pi*5*t)   # 5 Hz true signal
noisy = clean + 0.5*np.random.randn(len(t))

# Design a 4th-order Butterworth low-pass filter, cutoff = 20 Hz
b, a = signal.butter(4, 20, btype="low", fs=fs)
filtered = signal.filtfilt(b, a, noisy)   # zero-phase filtering

plt.figure(figsize=(10, 4))
plt.plot(t, noisy, alpha=0.4, label="Noisy", lw=1)
plt.plot(t, filtered, color="#DC2626", label="Filtered", lw=2)
plt.xlabel("Time (s)"); plt.ylabel("Signal")
plt.title("Butterworth Low-Pass Filter")
plt.legend(); plt.grid(True, alpha=0.3)
plt.show()
```
- scipy.integrate.quad(f, a, b): definite integral of function f from a to b
- scipy.optimize.minimize(f, x0): minimise a scalar function (e.g. drag)
- scipy.optimize.brentq(f, a, b): find a root of f in [a, b] (e.g. zero-lift angle)
- scipy.interpolate.interp1d(x, y): interpolate between data points
- scipy.stats.ttest_ind(a, b): t-test, are two datasets statistically different?
- scipy.stats.norm.ppf(0.95): 95th percentile of the normal distribution
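Two of these in action, a minimal sketch on functions with known exact answers:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# Definite integral: the integral of sin(x) from 0 to π is exactly 2
area, err = quad(np.sin, 0, np.pi)
print(round(area, 6))   # 2.0

# Root finding: cos(x) = 0 between 0 and 2 → x = π/2 ≈ 1.5708
root = brentq(np.cos, 0, 2)
print(round(root, 4))   # 1.5708
```

brentq needs a bracket [a, b] where the function changes sign; pick it from a quick plot of your data.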
Debugging & Profiling
Debugging and profiling are as important as writing new code. This section covers how to find errors fast and measure where time is actually being spent.
```python
import numpy as np

# BUG 1: Integer division truncates silently in C-family languages
strain = 3 / 1000
# Python 3: 0.003 (/ is always true division) ✓
# C++ (and Python 2): 3 / 1000 is int/int = 0 — the strain vanishes!
# C++ fix: write 3.0 / 1000 to force floating-point division

# BUG 2: Comparing floats with == (almost always wrong)
a = 0.1 + 0.2
print(a == 0.3)             # False! (floating point)
print(np.isclose(a, 0.3))   # True ← use this instead

# BUG 3: Mutating a list you think you're copying
original = [1, 2, 3]
copy = original   # NOT a copy — same object!
copy[0] = 99
print(original)   # [99, 2, 3] — original changed!
# Fix: copy = original.copy()   # or list(original) or original[:]
# NumPy: arr2 = arr.copy()      # NOT arr2 = arr

# BUG 4: Degrees vs radians (kills calculations silently)
angle = 45                      # degrees
print(np.sin(angle))            # 0.8509 — WRONG, treats 45 as radians
print(np.sin(np.radians(45)))   # 0.7071 — correct

# BUG 5: NaN propagation — one bad value corrupts everything
data = np.array([1.0, 2.0, np.nan, 4.0])
print(np.mean(data))            # nan — mean fails silently
print(np.nanmean(data))         # 2.333 — use nan-safe versions
print(np.any(np.isnan(data)))   # True — always check your data
```
```python
import time
import numpy as np

# Simple timing with perf_counter (high resolution)
start = time.perf_counter()
# --- your code here ---
result = np.sum(np.random.randn(1_000_000))
elapsed = time.perf_counter() - start
print(f"Elapsed: {elapsed*1000:.2f} ms")

# --- Python loop vs NumPy — why vectorisation matters ---
data = np.random.randn(100_000)

# Slow: Python loop
t0 = time.perf_counter()
total = 0
for x in data:
    total += x ** 2
print(f"Loop:  {(time.perf_counter()-t0)*1000:.1f} ms")

# Fast: NumPy vectorised
t0 = time.perf_counter()
total = np.sum(data ** 2)
print(f"NumPy: {(time.perf_counter()-t0)*1000:.1f} ms")

# NumPy is typically 50–200x faster on large arrays

# Full profiling with cProfile (run from the terminal):
# python -m cProfile -s cumulative your_script.py
```
```python
import numpy as np

def compute_stress(force_N, area_mm2):
    # Guard against obviously wrong inputs
    assert force_N >= 0, "Force must be non-negative"
    assert area_mm2 > 0, "Area must be positive"
    stress = force_N / area_mm2
    assert stress < 2000, f"Stress {stress:.1f} MPa exceeds material limit"
    return stress

# Validate array shapes before matrix operations
A = np.eye(3) * 2.0   # well-conditioned test matrix
b = np.ones(3)
assert A.shape[1] == b.shape[0], "Matrix dimensions must match"

# Check no NaN crept in after operations
result = np.linalg.solve(A, b)
assert not np.any(np.isnan(result)), "NaN in result — check input matrix"
print("All checks passed.")
```
| Error | What it means | How to fix it |
|---|---|---|
| NameError: name 'x' is not defined | You used a variable before creating it, or misspelled it | Check spelling. Check you assigned it before using it. |
| TypeError: unsupported operand type | Maths on incompatible types: e.g. int + string | Convert types: int(x), float(x), str(x) |
| ValueError: could not convert string to float | Tried to convert "abc" or "" to a number | Check your data for non-numeric values in numeric columns |
| IndexError: list index out of range | Accessed index that doesn't exist: e.g. list[5] on a 3-item list | Check length first: len(x). Remember index starts at 0. |
| KeyError: 'column_name' | Column doesn't exist in DataFrame | Print df.columns to see actual names. Check for spaces. |
| LinAlgError: Singular matrix | Matrix has no inverse: system has no unique solution | Check your stiffness matrix for unconstrained DOFs |
| shapes (3,) and (4,) not aligned | NumPy matrix dimension mismatch | Print .shape on both arrays before the operation |
| segmentation fault (C++) | Accessed memory you don't own: usually array out of bounds | Check all array indices. Use at() instead of [] for bounds checking. |
| undefined reference to (C++) | Function declared but not defined, or missing library link | Add the implementation, or link the library: -lm, -lopencv_core |
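Several of these errors can be caught before they crash a batch run; a minimal sketch of defensive checks (the data values are made up):

```python
import numpy as np

# IndexError: check the length before indexing
data = [1.2, 3.4, 5.6]
i = 5
if i < len(data):
    print(data[i])
else:
    print(f"Index {i} out of range for length {len(data)}")

# Shape mismatch: NumPy raises ValueError — catch it with the shapes in hand
A = np.ones((3, 4))
b = np.ones(3)
try:
    A @ b
except ValueError:
    print(f"Cannot multiply shapes {A.shape} and {b.shape}")
```

In a loop over 50 test files, a try/except like this lets you log the bad file and keep processing the rest.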
Units, Constants & Pitfalls
Unit errors are silent and catastrophic. The Mars Climate Orbiter is the most expensive example. This section covers how to handle units correctly in Python.
```python
from scipy import constants as c
import numpy as np

# Physical constants (exact scipy values)
print(c.g)               # 9.80665 m/s² standard gravity
print(c.R)               # 8.314... J/(mol·K) universal gas constant
print(c.atm)             # 101325 Pa standard atmosphere
print(c.speed_of_light)  # 299792458 m/s
print(c.pi)              # same as np.pi

# Aerospace / mechanical constants — define these at the top of every script
g0     = 9.80665    # m/s² — standard gravity (ISA)
R_air  = 287.058    # J/(kg·K) — specific gas constant for dry air
gamma  = 1.4        # — ratio of specific heats for air
P_sl   = 101325.0   # Pa — sea level pressure (ISA)
T_sl   = 288.15     # K — sea level temperature (ISA)
rho_sl = 1.225      # kg/m³ — sea level density (ISA)
a_sl   = np.sqrt(gamma * R_air * T_sl)   # 340.3 m/s — speed of sound

# Unit conversions — name them clearly
MPa_to_psi = 145.038
psi_to_MPa = 1 / MPa_to_psi
kts_to_ms  = 0.514444
ms_to_kts  = 1 / kts_to_ms
ft_to_m    = 0.3048
m_to_ft    = 1 / ft_to_m
lbf_to_N   = 4.44822
deg_to_rad = np.pi / 180   # or use np.radians()
```
Printing 3.141592653589793 in an engineering report is wrong. Knowing how to format numbers (significant figures, scientific notation, fixed decimal places) is part of professional engineering output.

```python
# Fixed decimal places
stress = 247.38291
print(f"{stress:.2f}")   # 247.38 (2 decimal places)
print(f"{stress:.0f}")   # 247 (round to integer)

# Significant figures
val = 0.0001234567
print(f"{val:.3g}")      # 0.000123 (3 sig figs)
print(f"{val:.4g}")      # 0.0001235

# Scientific notation
Re = 4_500_000
print(f"{Re:.2e}")       # 4.50e+06
print(f"{Re:,.0f}")      # 4,500,000 (thousands separator)

# Width padding — align columns
data = [("Steel", 400, 210), ("Aluminium", 270, 69), ("CFRP", 600, 150)]
print(f"{'Material':12} {'UTS (MPa)':>10} {'E (GPa)':>10}")
print("-" * 34)
for name, uts, E in data:
    print(f"{name:12} {uts:>10.0f} {E:>10.0f}")
```
Use the pint library to attach units to your numbers. It will raise an error if you accidentally add metres to feet, or multiply stress by the wrong area units. Install with pip install pint.

```python
from pint import UnitRegistry

ureg = UnitRegistry()
Q = ureg.Quantity

# Attach units to values
force = Q(10000, "N")
area = Q(50, "mm**2")

# Arithmetic is unit-aware
stress = force / area
print(stress)            # 200.0 N/mm²
print(stress.to("MPa"))  # 200.0 MPa (N/mm² = MPa)
print(stress.to("psi"))  # 29007.5 psi

# Unit mismatch raises an error immediately
height = Q(100, "ft")
# force + height → DimensionalityError: cannot add [force] and [length]

# Convert velocity
speed = Q(250, "m/s")
print(speed.to("knots"))  # 486.0 knots
print(speed.to("mph"))    # 559.2 mph
```
np.interp is one of the most-used functions in engineering code: look up aerodynamic coefficients from a table, find material properties at a temperature not in your dataset, interpolate standard atmosphere values.

```python
import numpy as np

# CL vs AoA lookup table (from wind tunnel data)
aoa_data = np.array([0, 2, 4, 6, 8, 10, 12])
CL_data = np.array([0.18, 0.42, 0.65, 0.88, 1.05, 1.18, 1.22])

# Interpolate at an arbitrary angle
aoa_query = 5.5
CL_at_5p5 = np.interp(aoa_query, aoa_data, CL_data)
print(f"CL at {aoa_query}°: {CL_at_5p5:.4f}")

# Interpolate multiple angles at once
query_angles = np.array([1, 3, 5, 7, 9, 11])
CL_interp = np.interp(query_angles, aoa_data, CL_data)
print(CL_interp)

# scipy for more control (cubic spline — smoother)
from scipy.interpolate import CubicSpline
cs = CubicSpline(aoa_data, CL_data)
print(cs(5.5))      # smoother than linear interp
print(cs(5.5, 1))   # derivative: dCL/dα at 5.5°
```
Pro Cheat Sheet
NumPy, pandas, matplotlib, and C++: condensed into a single reference page.
| Task | Code | Notes |
|---|---|---|
| Create array | np.array([1, 2, 3]) | Basic creation |
| Range of floats | np.linspace(0, 10, 100) | 100 points from 0 to 10 |
| Range with step | np.arange(0, 10, 0.5) | 0, 0.5, 1.0 … 9.5 |
| Zeros / Ones | np.zeros(n) / np.ones(n) | Fill array with 0 or 1 |
| Identity matrix | np.eye(n) | n×n identity |
| Element-wise maths | a + b, a * b, a ** 2 | No loop needed |
| Matrix multiply | A @ B | NOT A * B (that's element-wise) |
| Dot product | np.dot(a, b) | Or a @ b for vectors |
| Transpose | A.T | |
| Inverse | np.linalg.inv(A) | Use solve() instead when possible |
| Solve Ax = b | np.linalg.solve(A, b) | Faster and more stable than inv() |
| Eigenvalues | np.linalg.eig(A) | Returns values and vectors |
| Mean / Std | np.mean(x) / np.std(x, ddof=1) | ddof=1 for sample |
| Min / Max | np.min(x) / np.max(x) | |
| Index of min/max | np.argmin(x) / np.argmax(x) | Returns index, not value |
| Sort | np.sort(x) | Returns sorted copy |
| Filter by condition | x[x > threshold] | Boolean indexing |
| Count condition true | np.sum(x > threshold) | |
| Replace values | np.where(x > 0, x, 0) | Clip negatives to zero |
| Clamp to range | np.clip(x, low, high) | |
| Interpolate | np.interp(x_new, x, y) | Linear interpolation |
| Cumulative sum | np.cumsum(x) | Running total |
| Gradient | np.gradient(y, x) | Numerical derivative dy/dx |
| Trapezoid integral | np.trapz(y, x) | Numerical integration |
| Linear fit | np.polyfit(x, y, 1) | Returns [slope, intercept] |
| Evaluate poly | np.polyval(coeffs, x) | Use after polyfit |
| Trig (degrees) | np.sin(np.radians(angle)) | Always convert first |
| Safe float compare | np.isclose(a, b) | Never use == on floats |
| Check for NaN | np.isnan(x) / np.any(np.isnan(x)) | Always check after loading data |
| NaN-safe mean | np.nanmean(x) | Ignores NaN values |
| Reshape | x.reshape(3, 4) | 3 rows, 4 cols |
| Flatten to 1D | x.flatten() | |
| Stack arrays | np.vstack([a, b]) / np.hstack([a, b]) | Vertical / horizontal |
| Log base 10 | np.log10(x) | For S-N curves, dB |
| Natural log | np.log(x) | |
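A few of the rows above in action; a quick sanity check you can paste into a REPL:

```python
import numpy as np

# Solve Ax = b instead of computing the inverse
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)          # preferred over np.linalg.inv(A) @ b
print(x)                           # [2. 3.]

# Clamp, conditional replace, boolean indexing
v = np.array([-2.0, 0.5, 4.0])
print(np.clip(v, 0, 3))            # [0.  0.5 3. ]
print(np.where(v > 0, v, 0))       # clip negatives to zero
print(v[v > 0])                    # filter by condition

# Safe float comparison
print(np.isclose(0.1 + 0.2, 0.3))  # True, whereas 0.1 + 0.2 == 0.3 is False
```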
pandas
| Task | Code | Notes |
|---|---|---|
| Load CSV | pd.read_csv("file.csv") | |
| Load Excel | pd.read_excel("file.xlsx", sheet_name="Results") | |
| First 5 rows | df.head() | |
| Summary stats | df.describe() | count, mean, std, min, max |
| Column names | df.columns.tolist() | |
| Missing values | df.isnull().sum() | Per column |
| Drop missing rows | df.dropna(subset=["col"]) | |
| Fill missing | df["col"].fillna(0) | Or fillna(df["col"].mean()) |
| Filter rows | df[df["col"] > value] | |
| Multiple conditions | df[(df["a"] > 1) & (df["b"] == "X")] | Parentheses required |
| Select columns | df[["col1", "col2"]] | Double brackets = DataFrame |
| Add column | df["new"] = df["a"] / df["b"] | |
| Rename column | df.rename(columns={"old":"new"}) | |
| Sort | df.sort_values("col", ascending=False) | |
| Group statistics | df.groupby("cat")["val"].mean() | Like pivot table |
| Multiple aggregations | df.groupby("cat").agg(m=("val","mean"), s=("val","std")) | |
| Merge two DataFrames | pd.merge(df1, df2, on="id") | Like VLOOKUP |
| Save to CSV | df.to_csv("out.csv", index=False) | |
| Save to Excel | df.to_excel("out.xlsx", index=False) | |
| Multi-sheet Excel | with pd.ExcelWriter("out.xlsx") as w: df.to_excel(w, sheet_name="S1") | |
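The groupby rows deserve a worked example; a minimal sketch on an inline DataFrame (column names invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "cat": ["A", "A", "B", "B"],
    "val": [10.0, 20.0, 30.0, 50.0],
})

# Group statistics: like a pivot table
means = df.groupby("cat")["val"].mean()
print(means)   # A -> 15.0, B -> 40.0

# Multiple aggregations in one pass (named aggregation)
agg = df.groupby("cat").agg(m=("val", "mean"), s=("val", "std"))
print(agg)
```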
matplotlib
| Element | Code | Notes |
|---|---|---|
| Figure size | plt.figure(figsize=(10, 6)) | Width × height in inches |
| Line plot | plt.plot(x, y, color="steelblue", lw=2, ls="--") | ls: "-", "--", ":", "-." |
| Scatter plot | plt.scatter(x, y, s=40, c="red", alpha=0.6) | s=marker size |
| Bar chart | plt.bar(categories, values) | |
| Histogram | plt.hist(data, bins=30, edgecolor="white") | |
| Horizontal line | plt.axhline(y=value, color="red", ls="--") | Limit lines, thresholds |
| Vertical line | plt.axvline(x=value, color="gray", ls=":") | |
| Axis labels | plt.xlabel("Time (s)", fontsize=12) | Always label axes |
| Title | plt.title("My Plot", fontsize=14) | |
| Legend | plt.legend(loc="upper right", fontsize=10) | Needs label= in plot() |
| Grid | plt.grid(True, alpha=0.3) | alpha controls opacity |
| Axis limits | plt.xlim(0, 100) / plt.ylim(0, 500) | |
| Log scale | plt.xscale("log") / plt.yscale("log") | S-N curves, frequency plots |
| Annotation | plt.annotate("text", xy=(x,y), xytext=(tx,ty), arrowprops=dict(arrowstyle="->")) | |
| Tight layout | plt.tight_layout() | Prevents label clipping |
| Save high-res | plt.savefig("plot.png", dpi=300, bbox_inches="tight") | 300 dpi for reports |
| Subplots | fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True) | 2 rows, shared x axis |
| Colour map scatter | sc = plt.scatter(x, y, c=z, cmap="plasma"); plt.colorbar(sc) | 3rd variable as colour |
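Several of these elements combined into one report-ready figure; a minimal sketch, saved with the Agg backend so it runs without a display (filename and data are placeholders):

```python
import matplotlib
matplotlib.use("Agg")   # headless backend: no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
y = np.sin(x) * np.exp(-x / 5)      # placeholder damped response

plt.figure(figsize=(10, 6))
plt.plot(x, y, color="steelblue", lw=2, label="Damped response")
plt.axhline(0, color="gray", ls=":")
plt.xlabel("Time (s)", fontsize=12)
plt.ylabel("Amplitude", fontsize=12)
plt.title("Cheat-sheet elements combined", fontsize=14)
plt.legend(loc="upper right", fontsize=10)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("demo_plot.png", dpi=150, bbox_inches="tight")
```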
C++
| Task | Code | Notes |
|---|---|---|
| Compile + run | g++ -O2 -std=c++17 -o out file.cpp && ./out | -O2 = optimise, -std=c++17 = modern C++ |
| Compile with maths | g++ -O2 file.cpp -lm -o out | Needed for sqrt, sin etc |
| Integer to float division | double r = (double)a / b; | Cast one to double first |
| Read file line by line | ifstream f("data.txt"); while(getline(f, line)){…} | |
| Write to file | ofstream f("out.txt"); f << value << "\n"; | |
| Vector (dynamic array) | vector<double> v = {1.0, 2.0}; v.push_back(3.0); | Use instead of raw arrays |
| Loop over vector | for (double x : v) { cout << x; } | Range-based for |
| Format output | cout << fixed << setprecision(3) << val; | #include <iomanip> |
| String to number | double x = stod("3.14"); int n = stoi("42"); | |
| Number to string | string s = to_string(42); | |
| Max / Min of two | max(a, b) / min(a, b) | #include <algorithm> |
| Abs value | abs(x) / fabs(x) | fabs for floats |
| Power | pow(base, exp) | #include <cmath> |
| Struct (data bundle) | struct Point { double x, y; }; Point p = {1.0, 2.0}; | Group related data |
| Check file opened | if (!file.is_open()) { cerr << "Error"; return 1; } | Always check |
02: Never use == with floats. Use np.isclose(a, b) or abs(a - b) < 1e-9.
03: Name your constants. g0 = 9.80665 at the top of the file. Never write 9.81 or 9.8 directly in a formula.
04: arr2 = arr does NOT copy. Use arr2 = arr.copy() in NumPy. For lists: lst2 = lst.copy().
05: Trig always in radians. Wrap every degree value: np.sin(np.radians(45)). Make it a habit.
06: Check for NaN before computing stats. Run np.any(np.isnan(data)) first. One NaN poisons every mean, std, and sum.
07: Save figures before plt.show(). savefig must come before show() or you'll get a blank file.
08: Use ddof=1 for sample std dev. Your test specimens are a sample, not the entire population.
09: Comment units, not just values. force = 10000 # N: future you will thank present you.
10: Use pathlib not os.path. from pathlib import Path: cleaner syntax, works on Windows and Linux identically.
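Tip 08 in two lines: NumPy defaults to the population formula (ddof=0), which understates the scatter of a small test sample. The specimen values here are made up for illustration:

```python
import numpy as np

specimens = np.array([512.0, 498.0, 505.0, 521.0, 490.0])  # hypothetical failure loads, kN
print(np.std(specimens))           # ddof=0: population formula (too small for a sample)
print(np.std(specimens, ddof=1))   # ddof=1: sample formula, use this for test data
```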
ISA Atmosphere Model
The International Standard Atmosphere is the foundation of every performance calculation in aviation. Before you compute fuel burn, range, or climb rate, you need density, pressure, temperature, and speed of sound at altitude. This is how you compute them correctly in Python and C++.
```python
import numpy as np

# ISA constants
T0 = 288.15      # K     sea-level temperature
P0 = 101325.0    # Pa    sea-level pressure
rho0 = 1.225     # kg/m³ sea-level density
L = 0.0065       # K/m   troposphere lapse rate
g0 = 9.80665     # m/s²
R = 287.058      # J/(kg·K) dry air
gamma = 1.4
H_trop = 11000   # m — tropopause altitude

T_trop = T0 - L * H_trop                       # 216.65 K at tropopause
P_trop = P0 * (T_trop / T0) ** (g0 / (L * R))

def isa(h_m):
    """
    ISA atmosphere for scalar or array altitude h_m (metres).
    Returns: T (K), P (Pa), rho (kg/m³), a (m/s)
    """
    h = np.atleast_1d(np.asarray(h_m, dtype=float))
    T = np.where(h <= H_trop, T0 - L * h, T_trop)
    P = np.where(h <= H_trop,
                 P0 * (T / T0) ** (g0 / (L * R)),
                 P_trop * np.exp(-g0 * (h - H_trop) / (R * T_trop)))
    rho = P / (R * T)
    a = np.sqrt(gamma * R * T)
    return T.squeeze(), P.squeeze(), rho.squeeze(), a.squeeze()

# --- Usage ---
altitudes_ft = np.array([0, 10000, 20000, 30000, 35000, 39000])
altitudes_m = altitudes_ft * 0.3048
T, P, rho, a = isa(altitudes_m)

print(f"{'Alt (ft)':10} {'T (K)':8} {'P (hPa)':10} {'rho (kg/m³)':13} {'a (kts)'}")
print("-" * 55)
for i, alt in enumerate(altitudes_ft):
    print(f"{alt:10.0f} {T[i]:8.2f} {P[i]/100:10.2f} {rho[i]:13.4f} {a[i]/0.514444:7.1f}")
```
```python
def mach_to_tas(M, h_m):
    """True Airspeed from Mach and altitude"""
    T, _, _, a = isa(h_m)
    return M * a  # m/s

def tas_to_eas(tas, h_m):
    """Equivalent Airspeed = TAS * sqrt(sigma)"""
    _, _, rho, _ = isa(h_m)
    sigma = rho / rho0
    return tas * np.sqrt(sigma)

def cas_to_tas(cas, h_m):
    """CAS to TAS using compressibility correction"""
    T, P, rho, _ = isa(h_m)
    # Subsonic isentropic relation
    qc = P0 * ((1 + 0.2 * (cas / (isa(0)[3]))**2) ** 3.5 - 1)
    M = np.sqrt(5 * ((qc / P + 1) ** (2/7) - 1))
    _, _, _, a = isa(h_m)
    return M * a

# Example: A320 cruise at FL350, M0.78
h_cruise = 35000 * 0.3048  # FL350 in metres
M = 0.78
TAS = mach_to_tas(M, h_cruise)
EAS = tas_to_eas(TAS, h_cruise)
print(f"FL350, M{M}:")
print(f"  TAS = {TAS:.1f} m/s ({TAS/0.514444:.1f} kts)")
print(f"  EAS = {EAS:.1f} m/s ({EAS/0.514444:.1f} kts)")
```
TAS = 232.5 m/s (452.0 kts) · EAS = 130.4 m/s (253.5 kts)
The density ratio σ at FL350 is about 0.31, so EAS = TAS × √0.31 ≈ 0.557 × TAS. This is why aircraft stall at the same EAS regardless of altitude.
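A quick standalone check of that factor, assuming the ISA density of 0.3796 kg/m³ at 35,000 ft:

```python
import math

sigma = 0.3796 / 1.225     # ISA density ratio at 35,000 ft
factor = math.sqrt(sigma)  # EAS = factor × TAS
print(round(sigma, 2), round(factor, 3))  # 0.31 0.557
```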
Cost Index & Fuel Burn
Cost Index (CI) is the single number that tells the FMS how to trade time cost against fuel cost. CI = 0 means fly for minimum fuel. CI = max means fly as fast as possible regardless of fuel. Every airline optimises CI per route, per day, per aircraft type. Here's how to model it.
Cost = CI × Time + Fuel. When CI = 0, only fuel matters: fly at MRC (Maximum Range Cruise). When CI is high, time cost dominates: fly near VMO. The optimal cruise Mach for a given CI sits between these extremes.

```python
import numpy as np
import matplotlib.pyplot as plt
# uses isa() from the ISA Atmosphere Model section above

# Aircraft parameters (A320-like)
W = 65000       # kg cruise weight
S = 122.6       # m² wing area
CD0 = 0.0240    # zero-lift drag
k = 0.0375      # induced drag factor (1/π·AR·e)
eta = 0.30      # overall propulsive efficiency (cruise)
LHV = 43.2e6    # J/kg Jet-A lower heating value

# Cruise conditions
h_m = 35000 * 0.3048
T, P, rho, a = isa(h_m)

# Mach sweep
M_range = np.linspace(0.65, 0.84, 200)
V = M_range * a                  # TAS, m/s
q = 0.5 * rho * V**2             # dynamic pressure
CL = W * 9.80665 / (q * S)       # level flight CL
CD = CD0 + k * CL**2             # total drag
D = q * S * CD                   # drag force, N
ff_kgs = D * V / (eta * LHV)     # fuel flow, kg/s
ff_kghr = ff_kgs * 3600          # fuel flow, kg/hr
SR = V / ff_kgs / 1000           # specific range, km/kg

# Cost Index analysis
# CI units: kg/min (fuel cost equivalent of 1 min of time)
def opt_mach_for_ci(CI_kgmin):
    """Find Mach that minimises total cost for given CI"""
    time_cost_per_km = CI_kgmin / (V * 60 / 1000)  # kg_eq/km from time
    fuel_cost_per_km = ff_kgs / (V / 1000)         # kg/km
    total_cost = time_cost_per_km + fuel_cost_per_km
    idx = np.argmin(total_cost)
    return M_range[idx], ff_kghr[idx]

print(f"{'CI (kg/min)':14} {'Opt Mach':12} {'Fuel Flow (kg/hr)'}")
print("-" * 42)
for CI in [0, 10, 20, 40, 60, 80, 100]:
    M_opt, ff_opt = opt_mach_for_ci(CI)
    print(f"{CI:14} {M_opt:12.3f} {ff_opt:.1f}")

# Plot: fuel flow and SR vs Mach
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(M_range, ff_kghr, color="#2563EB", lw=2)
ax1.set_xlabel("Mach"); ax1.set_ylabel("Fuel Flow (kg/hr)")
ax1.set_title("Fuel Flow vs Mach"); ax1.grid(True, alpha=0.3)
ax2.plot(M_range, SR, color="#16A34A", lw=2)
ax2.set_xlabel("Mach"); ax2.set_ylabel("Specific Range (km/kg)")
ax2.set_title("Specific Range vs Mach"); ax2.grid(True, alpha=0.3)
plt.tight_layout(); plt.show()
```
```python
import numpy as np
import matplotlib.pyplot as plt
# uses isa() from the ISA Atmosphere Model section above

def breguet_range(V_ms, LD, SFC_kgNs, W_initial, W_final):
    """
    Breguet range equation.
    V_ms      : cruise TAS, m/s
    LD        : cruise lift-to-drag ratio
    SFC_kgNs  : specific fuel consumption, kg/(N·s)
    W_initial : initial cruise weight, N
    W_final   : final cruise weight (= W_initial - W_fuel), N
    Returns   : range in km
    """
    return (V_ms / (SFC_kgNs * 9.80665)) * LD * np.log(W_initial / W_final) / 1000

# A320neo-like parameters
MTOW = 79000     # kg — max takeoff weight
OEW = 42600      # kg — operating empty weight
payload = 15000  # kg — passengers + bags (~150 pax)
fuel = 18000     # kg — usable fuel

h_m = 35000 * 0.3048
_, _, _, a = isa(h_m)
V_cruise = 0.78 * a  # TAS at M0.78, FL350
LD = 17.5            # typical A320neo cruise L/D
SFC = 1.65e-5        # kg/(N·s) — LEAP-1A26 typical

W_init = (OEW + payload + fuel) * 9.80665
W_final = (OEW + payload) * 9.80665
R = breguet_range(V_cruise, LD, SFC, W_init, W_final)

print(f"Cruise Mach:     M0.78")
print(f"TAS:             {V_cruise/0.514444:.1f} kts")
print(f"L/D:             {LD}")
print(f"Fuel burned:     {fuel:,} kg")
print(f"Range (Breguet): {R:.0f} km ({R/1.852:.0f} nm)")

# Sensitivity: range vs payload (range-payload curve)
payloads = np.linspace(0, 20000, 100)
ranges = []
for pl in payloads:
    fuel_avail = MTOW - OEW - pl  # fuel limited by MTOW
    fuel_avail = np.clip(fuel_avail, 0, 18000)
    Wi = (OEW + pl + fuel_avail) * 9.80665
    Wf = (OEW + pl) * 9.80665
    ranges.append(breguet_range(V_cruise, LD, SFC, Wi, Wf))

plt.figure(figsize=(9, 5))
plt.plot(payloads/1000, ranges, color="#2563EB", lw=2)
plt.xlabel("Payload (tonnes)"); plt.ylabel("Range (km)")
plt.title("Range–Payload Curve (A320neo-like)")
plt.grid(True, alpha=0.3); plt.tight_layout(); plt.show()
```
Aircraft Performance
Performance engineering calculates what an aircraft can actually do: climb rate, cruise ceiling, takeoff distance, fuel to destination. These are the numbers that go into flight manuals, dispatch releases, and weight & balance calculations.
```python
import numpy as np
from scipy.integrate import solve_ivp
# uses isa() from the ISA Atmosphere Model section above

# Aircraft (A320-like at typical climb weight)
W_N = 70000 * 9.80665  # N
S = 122.6              # m²
CD0 = 0.0280           # climb config (no flaps)
k = 0.0375
SFC = 1.80e-5          # kg/(N·s) climb SFC (higher than cruise)

# Thrust model: T = T_sl * (rho/rho0)^0.75 (simplified)
T_SL = 120000          # N — total SL static thrust (2 × CFM56)

def climb_ode(t, y):
    """y = [altitude m, fuel burned kg]"""
    h, mf = y
    T_ISA, P, rho, a = isa(h)
    T = T_SL * (rho / 1.225) ** 0.75  # thrust at altitude
    M = 0.76                          # constant Mach climb above 10000ft
    V = M * a
    q = 0.5 * rho * V**2
    CL = W_N / (q * S)
    CD = CD0 + k * CL**2
    D = q * S * CD
    ROC = V * (T - D) / W_N           # m/s rate of climb
    ff = T * SFC                      # kg/s fuel flow
    return [ROC, ff]

# Integrate from 3000 ft to FL350
h_init = 3000 * 0.3048
h_end = 35000 * 0.3048

# Stop event: when altitude reaches FL350
def reached_cruise(t, y):
    return y[0] - h_end
reached_cruise.terminal = True

sol = solve_ivp(climb_ode, [0, 3600], [h_init, 0],
                events=reached_cruise, max_step=10, dense_output=True)

t_climb = sol.t_events[0][0]
fuel_climb = sol.y_events[0][0][1]
print(f"Time to climb FL030→FL350: {t_climb/60:.1f} min")
print(f"Fuel to climb:             {fuel_climb:.0f} kg")
print(f"Average ROC:               {(h_end-h_init)/t_climb*196.85:.0f} fpm")
```
```python
import numpy as np
# uses isa() from the ISA Atmosphere Model section above

def opt_altitude(W_kg, M=0.78):
    """
    Find altitude that maximises specific range for given weight and Mach.
    Uses simplified drag model.
    """
    altitudes = np.linspace(25000, 43000, 200) * 0.3048
    best_SR, best_alt = 0, altitudes[0]
    for h in altitudes:
        T, P, rho, a = isa(h)
        V = M * a
        q = 0.5 * rho * V**2
        CL = W_kg * 9.80665 / (q * 122.6)
        CD = 0.0240 + 0.0375 * CL**2
        D = q * 122.6 * CD
        ff = D * V / (0.30 * 43.2e6)
        SR = V / ff / 1000
        if SR > best_SR:
            best_SR, best_alt = SR, h
    return best_alt / 0.3048, best_SR

# Track optimum altitude as fuel burns off during flight
weights = np.arange(75000, 52000, -1000)  # kg, fuel burning off

print(f"{'Weight (kg)':14} {'Opt Alt (ft)':15} {'Spec. Range (km/kg)'}")
print("-" * 46)
for W in weights[::5]:  # print every 5th step
    alt, sr = opt_altitude(W)
    print(f"{W:14,} {alt:15.0f} {sr:.4f}")
```
Structural Loads & Fatigue
Every flight imposes loads on the airframe: manoeuvres, gusts, landing impacts. Fatigue is the accumulation of damage from repeated loads over thousands of flights. These tools let you build V-n diagrams, plot S-N curves, and apply Miner's rule for damage summation.
```python
import numpy as np
import matplotlib.pyplot as plt

# Aircraft parameters
W = 70000     # kg — design weight
S = 122.6     # m²
CLmax = 1.52  # clean configuration
CLmin = -0.80 # negative lift limit

# CS-25 load factor limits
n_lim_pos = 2.5             # positive limit (CS25.337)
n_lim_neg = -1.0            # negative limit
n_ult_pos = n_lim_pos * 1.5 # 1.5 × limit = ultimate

# Reference conditions (sea level ISA)
rho = 1.225
g = 9.80665
W_N = W * g

# Stall speed at 1g (EAS)
Vs = np.sqrt(2 * W_N / (rho * S * CLmax))  # m/s EAS
Vs_kts = Vs / 0.514444

# Manoeuvre speed Va = Vs * sqrt(n_lim)
Va = Vs * np.sqrt(n_lim_pos)
Vc = 175  # m/s EAS — design cruise speed
Vd = 210  # m/s EAS — design dive speed

# Stall boundary curves
V_range = np.linspace(0, Vd, 500)
n_pos = (rho * V_range**2 * S * CLmax) / (2 * W_N)  # positive stall
n_neg = (rho * V_range**2 * S * CLmin) / (2 * W_N)  # negative stall

fig, ax = plt.subplots(figsize=(10, 6))

# Positive stall boundary (up to n_lim)
mask_pos = n_pos <= n_lim_pos
ax.plot(V_range[mask_pos] / 0.514444, n_pos[mask_pos],
        color="#2563EB", lw=2, label="Stall boundary")

# Structural limits
ax.axhline(n_lim_pos, color="#DC2626", lw=1.5, ls="--", label=f"Limit load n={n_lim_pos}g")
ax.axhline(n_ult_pos, color="#DC2626", lw=1, ls=":", label=f"Ultimate load n={n_ult_pos}g")
ax.axhline(n_lim_neg, color="#2563EB", lw=1.5, ls="--", label=f"Negative limit n={n_lim_neg}g")

# Vertical speed limits
ax.axvline(Va / 0.514444, color="gray", lw=1, ls=":", label=f"Va = {Va/0.514444:.0f} kts")
ax.axvline(Vc / 0.514444, color="green", lw=1, ls=":", label=f"Vc = {Vc/0.514444:.0f} kts")
ax.axvline(Vd / 0.514444, color="red", lw=1, ls=":", label=f"Vd = {Vd/0.514444:.0f} kts")

ax.fill_between(V_range[mask_pos] / 0.514444, n_pos[mask_pos], 0,
                alpha=0.07, color="#2563EB")
ax.set_xlabel("EAS (knots)"); ax.set_ylabel("Load Factor n (g)")
ax.set_title("V-n Diagram — CS-25 Transport")
ax.legend(fontsize=9); ax.grid(True, alpha=0.3)
ax.set_xlim(0, Vd/0.514444 + 20); ax.set_ylim(-1.5, 4.5)
plt.tight_layout(); plt.show()

print(f"1g stall speed: {Vs_kts:.1f} kts EAS")
print(f"Va (manoeuvre): {Va/0.514444:.1f} kts EAS")
```
```python
import numpy as np

# S-N curve for Al 2024-T3 (log-log linear fit from AV3 regression)
# log10(N) = a - b * log10(sigma)
a_sn = 15.2
b_sn = 4.8

def cycles_to_failure(sigma_MPa):
    """N from S-N curve"""
    return 10 ** (a_sn - b_sn * np.log10(sigma_MPa))

# Mission spectrum: typical short-haul aircraft
# Each row: [stress amplitude MPa, occurrences per flight]
spectrum = np.array([
    [250,   1],  # rotation / landing
    [180,   3],  # turbulence moderate
    [120,   8],  # light turbulence
    [ 80,  25],  # manoeuvres
    [ 50, 100],  # pressurisation cycles within flight
    [ 30, 500],  # minor vibration
])

sigma = spectrum[:, 0]
n_occ = spectrum[:, 1]
N_f = cycles_to_failure(sigma)

D_per_flight = np.sum(n_occ / N_f)  # Miner's sum per flight
life_flights = 1 / D_per_flight     # flights until D = 1.0

print(f"{'Stress (MPa)':15} {'n/flight':12} {'N_f':15} {'n/N (damage)'}")
print("-" * 55)
for i in range(len(sigma)):
    print(f"{sigma[i]:15.0f} {n_occ[i]:12.0f} {N_f[i]:15.2e} {n_occ[i]/N_f[i]:.2e}")

print(f"\nDamage per flight: {D_per_flight:.2e}")
print(f"Predicted life:    {life_flights:,.0f} flights")
print(f"                   {life_flights/365:.0f} years at 1 flight/day")

# Current fleet status
flights_flown = 12500
damage_accrued = flights_flown * D_per_flight
remaining_life = (1 - damage_accrued) / D_per_flight

print(f"\nFlights flown:  {flights_flown:,}")
print(f"Damage accrued: {damage_accrued:.3f} ({damage_accrued*100:.1f}%)")
print(f"Remaining life: {remaining_life:,.0f} flights")
```
Aerodynamic Data Analysis
Wind tunnel and flight test produce tables of numbers. This module shows how to extract aerodynamic coefficients, build drag polars, fit lift curves, and compute induced drag efficiency: all from raw data.
```python
import numpy as np
import matplotlib.pyplot as plt

# Flight test data: (CL, CD) pairs from level-flight test points
# Each row is one test condition (different weight/altitude/speed)
CL_data = np.array([0.20, 0.30, 0.40, 0.50, 0.60,
                    0.70, 0.80, 0.90, 1.00, 1.10])
CD_data = np.array([0.0258, 0.0268, 0.0284, 0.0311, 0.0347,
                    0.0393, 0.0454, 0.0529, 0.0625, 0.0741])

# Fit CD = CD0 + k * CL^2 → linear regression on (CL², CD)
CL2 = CL_data**2
coeffs = np.polyfit(CL2, CD_data, 1)
k_fit = coeffs[0]
CD0_fit = coeffs[1]

# Oswald efficiency factor
AR = 9.4  # A320 aspect ratio
e_oswald = 1 / (np.pi * AR * k_fit)

print("Drag polar fit:")
print(f"  CD0 = {CD0_fit:.5f}")
print(f"  k   = {k_fit:.5f}")
print(f"  e   = {e_oswald:.3f} (Oswald efficiency)")

# Maximum L/D and the CL at which it occurs
CL_maxLD = np.sqrt(CD0_fit / k_fit)
LD_max = 1 / (2 * np.sqrt(CD0_fit * k_fit))  # = CL_maxLD / (2 * CD0)

print(f"  CL at max L/D = {CL_maxLD:.3f}")
print(f"  Max L/D       = {LD_max:.1f}")

# Plot the polar
CL_curve = np.linspace(0, 1.3, 200)
CD_curve = CD0_fit + k_fit * CL_curve**2

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# CL vs CD
ax1.scatter(CD_data, CL_data, color="steelblue", s=50, zorder=5, label="Test data")
ax1.plot(CD_curve, CL_curve, "r-", lw=2,
         label=f"Fit: CD0={CD0_fit:.4f}, k={k_fit:.4f}")
ax1.axvline(2 * CD0_fit, color="green", ls="--", lw=1,
            label="CD at max L/D (= 2·CD0)")
ax1.set_xlabel("CD"); ax1.set_ylabel("CL")
ax1.set_title("Drag Polar"); ax1.legend(fontsize=8); ax1.grid(True, alpha=0.3)

# L/D vs CL
LD_curve = CL_curve / CD_curve
ax2.plot(CL_curve, LD_curve, color="#16A34A", lw=2)
ax2.axvline(CL_maxLD, color="red", ls="--", lw=1,
            label=f"Best CL={CL_maxLD:.3f}, L/D={LD_max:.1f}")
ax2.set_xlabel("CL"); ax2.set_ylabel("L/D")
ax2.set_title("Lift-to-Drag Ratio"); ax2.legend(fontsize=9); ax2.grid(True, alpha=0.3)

plt.tight_layout(); plt.show()
```
Flight Data Analysis · FDR & QAR
Every commercial aircraft records hundreds of parameters every second into the Flight Data Recorder (FDR) and Quick Access Recorder (QAR). Airlines use QAR data for safety monitoring, fuel analysis, and exceedance detection. This is how you process it with Python.
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# QAR data is usually a CSV or binary file exported from the ACMS
# Typical columns: time, altitude, airspeed, heading, N1%, temp,
#                  fuel_flow, load_factor, pitch, roll, etc.
df = pd.read_csv("flight_WB123_20260322.csv")

# Basic inspection
print(f"Flight duration: {len(df)/4:.0f} seconds ({len(df)/4/60:.1f} min)")  # 4 Hz data
print(f"Parameters: {list(df.columns)}")
print(df.describe().round(2))

# Check for invalid/missing data
print("\nMissing values:")
print(df.isnull().sum())

# Replace clearly invalid sensor spikes (e.g. altitude = 0 mid-flight)
df["altitude_ft"] = df["altitude_ft"].replace(0, np.nan).interpolate()
```
```python
import pandas as pd
import numpy as np

# Define exceedance thresholds (simplified A320 limits)
limits = {
    "load_factor_nz": (-0.5,  2.0),    # g — smooth air ops limits
    "pitch_deg":      (-5.0,  25.0),   # deg
    "bank_angle_deg": (-30.0, 30.0),   # deg
    "N1_pct":         (0,     104.0),  # %
    "airspeed_kts":   (0,     350.0),  # kts below FL100
    "vsi_fpm":        (-2000, 2000),   # fpm vertical speed
}

exceedances = []
for param, (lo, hi) in limits.items():
    if param not in df.columns:
        continue
    col = df[param]
    exceed = (col < lo) | (col > hi)
    if exceed.any():
        # Find contiguous exceedance windows
        changes = exceed.astype(int).diff().fillna(0)
        starts = df.index[changes == 1]
        ends = df.index[changes == -1]
        for s, e in zip(starts, ends):
            window = col.loc[s:e]
            peak_val = window.max() if window.max() > hi else window.min()
            duration = (e - s) / 4  # seconds (4 Hz data)
            exceedances.append({
                "parameter": param,
                "limit": hi if peak_val > hi else lo,
                "peak_value": round(peak_val, 2),
                "duration_s": duration,
                "time_s": s / 4
            })

result = pd.DataFrame(exceedances)
if result.empty:  # .empty is a property, not a method
    print("No exceedances detected. Clean flight.")
else:
    print(f"{len(result)} exceedances detected:")
    print(result.to_string(index=False))
```
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load month of flight records (from OPS system export)
flights = pd.read_csv("fleet_march2026.csv")
# Expected columns: flight_no, reg, route, dist_nm, fuel_plan_kg,
#                   fuel_actual_kg, pax, cargo_kg, wind_comp_kts

# Fuel delta: actual minus planned
flights["fuel_delta"] = flights["fuel_actual_kg"] - flights["fuel_plan_kg"]
flights["fuel_delta_pct"] = flights["fuel_delta"] / flights["fuel_plan_kg"] * 100
flights["fuel_per_nm"] = flights["fuel_actual_kg"] / flights["dist_nm"]

# Fleet summary
print("=== Fleet Fuel Performance — March 2026 ===")
print(f"Flights analysed: {len(flights)}")
print(f"Mean fuel delta:  {flights.fuel_delta.mean():.0f} kg ({flights.fuel_delta_pct.mean():.1f}%)")
print(f"Std deviation:    {flights.fuel_delta.std():.0f} kg")
print(f"Worst flight:     +{flights.fuel_delta.max():.0f} kg")
print(f"Best flight:      {flights.fuel_delta.min():.0f} kg")

# Per-tail analysis (identify aircraft with chronic over-burn)
per_reg = flights.groupby("reg").agg(
    flights       = ("flight_no", "count"),
    mean_delta_kg = ("fuel_delta", "mean"),
    total_excess  = ("fuel_delta", "sum")
).round(1).sort_values("mean_delta_kg", ascending=False)

print("\nPer-aircraft fuel performance:")
print(per_reg.head(10))

# Correlation: wind vs fuel delta
corr = flights["wind_comp_kts"].corr(flights["fuel_delta"])
print(f"\nWind vs fuel delta correlation: {corr:.3f}")
# Strong negative correlation expected: headwind → more fuel

# Plot distribution of fuel delta
plt.figure(figsize=(9, 4))
plt.hist(flights["fuel_delta"], bins=40, color="steelblue", edgecolor="white")
plt.axvline(0, color="red", lw=1.5, label="Plan")
plt.axvline(flights.fuel_delta.mean(), color="orange", lw=1.5, ls="--",
            label=f"Mean: {flights.fuel_delta.mean():.0f} kg")
plt.xlabel("Fuel Delta (kg)"); plt.ylabel("Flights")
plt.title("Actual vs Planned Fuel — March 2026")
plt.legend(); plt.tight_layout(); plt.show()
```
- isa(h_m) → T, P, rho, a at any altitude
- mach_to_tas(M, h) → TAS in m/s
- breguet_range(V, LD, SFC, Wi, Wf) → range in km
- np.polyfit(CL**2, CD, 1) → drag polar coefficients [k, CD0]
- np.sqrt(CD0/k) → CL at max L/D
- df[df["nz"] > 2.0] → filter exceedances from QAR data
- df.groupby("reg")["fuel_delta"].mean() → per-aircraft fuel performance
- solve_ivp(eom, [0,T], y0, events=cruise_reached) → integrate climb to cruise
Cleaning Messy Data
Real data is broken. Missing values, impossible readings, duplicate rows, columns that are numbers stored as text. Every analysis starts here. Get this wrong and every result downstream is wrong too.
```python
import pandas as pd
import numpy as np

df = pd.read_csv("sensor_data.csv")

print(df.shape)               # (rows, columns)
print(df.dtypes)              # spot columns stored as wrong type
print(df.isnull().sum())      # missing values per column
print(df.duplicated().sum())  # duplicate rows
print(df.describe())          # min/max will expose impossible values

# Spot columns that should be numeric but aren't
for col in df.select_dtypes("object").columns:
    sample = df[col].dropna().head(3).tolist()
    print(f"{col}: {sample}")
```
```python
# Drop rows missing critical values
df = df.dropna(subset=["timestamp", "altitude_ft"])

# Fill sensor gaps with forward-fill (last known value)
df["temperature"] = df["temperature"].ffill()

# Fill with column mean (for scattered random gaps)
df["pressure"] = df["pressure"].fillna(df["pressure"].mean())

# Interpolate (smooth gap fill — good for time series)
df["airspeed"] = df["airspeed"].interpolate(method="linear")

# Check NaN-safe mean vs regular mean (on the raw array:
# a pandas Series skips NaN by default, a NumPy array does not)
arr = df["load_factor"].to_numpy()
print(np.nanmean(arr))  # ignores NaN
print(np.mean(arr))     # returns NaN if any NaN present
```
```python
import numpy as np
from scipy import stats

col = df["altitude_ft"]

# Method 1: IQR fence (robust to skewed distributions)
Q1, Q3 = col.quantile([0.25, 0.75])
IQR = Q3 - Q1
lo, hi = Q1 - 1.5*IQR, Q3 + 1.5*IQR
df_clean = df[col.between(lo, hi)]
print(f"Removed {len(df) - len(df_clean)} outliers")

# Method 2: Z-score (assumes normal distribution)
# nan_policy="omit" keeps the result aligned with df even if col has NaN
z = np.abs(stats.zscore(col, nan_policy="omit"))
df_clean = df[z < 3]  # keep within 3 standard deviations

# Method 3: Physical bounds (best for engineering data)
# You know the sensor range — use it
df = df[df["altitude_ft"].between(-1000, 60000)]
df = df[df["airspeed_kts"].between(0, 600)]
df = df[df["load_factor"].between(-3, 5)]
```
```python
# Columns stored as strings when they should be numbers
df["speed"] = pd.to_numeric(df["speed"], errors="coerce")
# errors="coerce" turns unparseable values into NaN instead of crashing

# Mixed units in one column — e.g. "250 kts" and "128 m/s"
def parse_speed(val):
    val = str(val).strip()
    if "kts" in val:
        return float(val.replace("kts", "")) * 0.514444  # to m/s
    elif "m/s" in val:
        return float(val.replace("m/s", ""))
    return np.nan

df["speed_ms"] = df["speed_raw"].apply(parse_speed)

# Standardise column names (common with multi-source data)
# regex=False: treat "(" and ")" as literal characters, not regex
df.columns = (df.columns
                .str.strip()
                .str.lower()
                .str.replace(" ", "_")
                .str.replace("(", "", regex=False)
                .str.replace(")", "", regex=False))

# Remove exact duplicate rows
df = df.drop_duplicates()
df = df.drop_duplicates(subset=["timestamp", "sensor_id"])
```
1. Check shape and dtypes
2. Count NaN per column, then decide: drop, fill, or interpolate
3. Check for duplicates
4. Run describe(): spot impossible min/max values
5. Standardise column names and units
6. Convert string-encoded numbers with pd.to_numeric(errors="coerce")
7. Assert final shape and NaN count before proceeding
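The checklist can be sketched as one reusable audit function; the column names and demo values below are placeholders, not a real dataset:

```python
import pandas as pd
import numpy as np

def audit_and_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Minimal version of the 7-step cleaning checklist."""
    # 1. Shape and dtypes
    print(df.shape)
    print(df.dtypes)
    # 2. NaN counts per column (decide per column: drop, fill, interpolate)
    print(df.isnull().sum())
    # 3. Duplicates
    df = df.drop_duplicates()
    # 4. Spot impossible min/max values
    print(df.describe())
    # 5. Standardise column names
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
    # 6. Convert string-encoded numbers
    for col in df.select_dtypes("object").columns:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # 7. Assert final state before proceeding
    assert not df.duplicated().any()
    return df

# Tiny placeholder dataset: one string column, one messy header
demo = pd.DataFrame({"Speed KTS": ["250", "260", "bad"], "Alt ft": [100, 100, 200]})
clean = audit_and_clean(demo)
print(clean.columns.tolist())  # ['speed_kts', 'alt_ft']
```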
Time Series Data
Sensor logs, QAR data, weather records, stock prices: all arrive as time series. Pandas has a full datetime index system built for this. Resampling, rolling windows, time-zone handling, gap detection.
```python
import pandas as pd

# Parse timestamps on load — always specify format if you know it
df = pd.read_csv("flight_log.csv",
                 parse_dates=["timestamp"],
                 date_format="%Y-%m-%dT%H:%M:%S.%f")

# Set as index — unlocks all time-series operations
df = df.set_index("timestamp").sort_index()

# Extract components
df["hour"] = df.index.hour
df["date"] = df.index.date
df["weekday"] = df.index.day_name()

# Slice by time range
morning = df.between_time("08:00", "12:00")  # time-of-day slice, any date
march = df.loc["2026-03"]                    # entire month (partial-string indexing)
window = df.loc["2026-03-01":"2026-03-15"]

# Check for time gaps (critical for sensor data)
time_diff = df.index.to_series().diff()
gaps = time_diff[time_diff > pd.Timedelta("1s")]
print(f"Gaps found: {len(gaps)}")
print(gaps)
```
```python
# Downsample: 100 Hz → 1 Hz (mean over each 1-second window)
df_1hz = df.resample("1s").mean()

# Downsample to 1-minute max load factor
df_1min_max = df["load_factor"].resample("1min").max()

# Upsample and interpolate (fill gaps to uniform spacing)
df_uniform = df.resample("10ms").interpolate(method="time")

# Common resample strings:
#   "10ms" = 10 milliseconds   "1s"  = 1 second
#   "1min" = 1 minute          "1h"  = 1 hour
#   "1D"   = 1 day             "1W"  = 1 week
#   "1ME"  = 1 month end       "1YE" = 1 year end

# Multiple aggregations in one pass
summary = df.resample("1min").agg({
    "altitude_ft":  ["mean", "max"],
    "airspeed_kts": ["mean", "std"],
    "load_factor":  ["min", "max"]
})
```
```python
import matplotlib.pyplot as plt

# Rolling mean — smooth out noise (window = number of samples)
df["alt_smooth"] = df["altitude_ft"].rolling(window=50).mean()

# Rolling std — detect sudden changes / turbulence events
df["alt_std"] = df["altitude_ft"].rolling(window=50).std()

# Rolling max — peak load in any 5-second window
df["peak_nz"] = df["load_factor"].rolling(window=500).max()

# Exponential weighted mean (more weight on recent points)
df["ema_speed"] = df["airspeed_kts"].ewm(span=20).mean()

# Rate of change — derivative (delta value / delta time)
dt = df.index.to_series().diff().dt.total_seconds()
df["d_altitude"] = df["altitude_ft"].diff() / dt
# d_altitude is now rate of climb in ft/s

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6), sharex=True)
ax1.plot(df.index, df["altitude_ft"], alpha=0.3, label="Raw")
ax1.plot(df.index, df["alt_smooth"], lw=2, label="Smoothed")
ax2.plot(df.index, df["alt_std"], color="red", label="Turbulence proxy")
for ax in (ax1, ax2):
    ax.legend(); ax.grid(True, alpha=0.3)
plt.tight_layout(); plt.show()
```
Reading Any File Format
Engineering data arrives in CSV, Excel, JSON, HDF5, Parquet, binary .dat, and occasionally formats invented by one vendor in 1994 and never updated since. This page covers them all.
```python
import pandas as pd

# CSV with non-standard delimiters (semicolon, tab)
df = pd.read_csv("data.csv", sep=";")
df = pd.read_csv("data.tsv", sep="\t")

# CSV with metadata rows at top (common in lab exports)
df = pd.read_csv("test.csv", skiprows=6, header=0)

# CSV with multiple headers (units row below column names)
df = pd.read_csv("test.csv", header=[0, 1])
df.columns = ["_".join(col).strip() for col in df.columns]

# Read specific columns only (large files)
df = pd.read_csv("big.csv", usecols=["time", "altitude", "speed"])

# Read in chunks (files too large for memory)
chunks = []
for chunk in pd.read_csv("huge.csv", chunksize=100_000):
    chunks.append(chunk[chunk["altitude"] > 10000])  # filter before concat
df = pd.concat(chunks, ignore_index=True)

# Excel — read all sheets
xl = pd.ExcelFile("results.xlsx")
print(xl.sheet_names)
df = xl.parse("Sheet1")
```
```python
import pandas as pd
import json

# Simple flat JSON
df = pd.read_json("data.json")

# Nested JSON — flatten it
with open("flight.json") as f:
    data = json.load(f)

# If structure is {"flight": {"params": [...]}}
df = pd.json_normalize(data["flight"]["params"])

# Deeply nested: expand a column of dicts
# df["nested_col"] = [{"x": 1, "y": 2}, {"x": 3, "y": 4}, ...]
expanded = pd.json_normalize(df["nested_col"])
df = df.drop("nested_col", axis=1).join(expanded)
```
```python
import h5py
import numpy as np
import pandas as pd

# Read HDF5 with h5py — inspect structure first
with h5py.File("simulation.h5", "r") as f:
    print(list(f.keys()))          # top-level groups

    def print_structure(name, obj):
        print(name, type(obj).__name__)
    f.visititems(print_structure)  # full tree

    # Read a dataset
    pressure = f["flow_field/pressure"][:]  # [:] loads into a numpy array
    coords = f["mesh/coordinates"][:]

# Read HDF5 with pandas (simpler for tabular data)
df = pd.read_hdf("results.h5", key="/run_001/measurements")

# Write HDF5 (efficient storage for large DataFrames)
df.to_hdf("output.h5", key="/results", mode="w", complevel=6)
```
```python
import pandas as pd

# Parquet: columnar format, 5-20x smaller than CSV, much faster to read
df = pd.read_parquet("flight_data.parquet")

# Read only specific columns (doesn't load the rest from disk)
df = pd.read_parquet("data.parquet", columns=["time", "altitude"])

# Write (replace your CSVs with this for anything > 10 MB)
df.to_parquet("output.parquet", index=False, compression="snappy")

# Benchmark: 1 million row flight dataset
#   CSV read:     ~4.2 seconds, 180 MB on disk
#   Parquet read: ~0.3 seconds,  12 MB on disk
```
```python
import struct
import numpy as np
import pandas as pd

# Example format: each record = timestamp (uint32) + 4 float32 channels
# Record size = 4 + 4*4 = 20 bytes
record_fmt = "<I4f"  # little-endian: uint32 + 4×float32
record_size = struct.calcsize(record_fmt)

records = []
with open("test_run.dat", "rb") as f:
    while True:
        raw = f.read(record_size)
        if len(raw) < record_size:
            break
        t, ch1, ch2, ch3, ch4 = struct.unpack(record_fmt, raw)
        records.append((t, ch1, ch2, ch3, ch4))

df = pd.DataFrame(records,
                  columns=["timestamp_ms", "pressure", "temperature", "flow", "voltage"])

# numpy fromfile — faster for uniform-type binary arrays
data = np.fromfile("raw_samples.dat", dtype=np.float32)
data = data.reshape(-1, 4)  # reshape into (n_samples, 4 channels)
```
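You can verify a record layout like the one above without a real .dat file by packing a few records into an in-memory buffer and reading them back. A self-contained sketch (the channel values are made up):

```python
import struct
from io import BytesIO

record_fmt = "<I4f"                        # uint32 timestamp + 4 float32 channels
record_size = struct.calcsize(record_fmt)  # 20 bytes

# Pack two made-up records into a buffer, exactly as a writer would
buf = BytesIO()
buf.write(struct.pack(record_fmt, 1000, 101.3, 21.5, 0.8, 4.9))
buf.write(struct.pack(record_fmt, 1010, 101.4, 21.6, 0.9, 5.0))
buf.seek(0)

# Read them back with the same loop used for a file
records = []
while True:
    raw = buf.read(record_size)
    if len(raw) < record_size:
        break
    records.append(struct.unpack(record_fmt, raw))

print(len(records), records[0][0])  # → 2 1000
```

Note that float32 values round-trip with reduced precision (101.3 comes back as roughly 101.300003), so compare channels with a tolerance, not `==`.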
Merging and Joining Datasets
Test data rarely arrives in one file. You have sensor readings from one system, test conditions from another, and post-processed results in a third. Combining them correctly is where most data pipelines break.
```python
import pandas as pd

# Two DataFrames sharing a key column
#   sensor_df: specimen_id, time, strain, load
#   meta_df:   specimen_id, material, thickness, heat_treatment

# Inner join — only rows with matching IDs in both
combined = pd.merge(sensor_df, meta_df, on="specimen_id", how="inner")

# Left join — keep all sensor rows, attach meta where available
combined = pd.merge(sensor_df, meta_df, on="specimen_id", how="left")

# Merge on multiple keys
combined = pd.merge(df1, df2, on=["flight_id", "leg_number"], how="inner")

# Different column names in each DataFrame
combined = pd.merge(df1, df2, left_on="test_ref", right_on="specimen_id", how="left")

# Diagnose a failed merge immediately
print(f"df1 rows: {len(df1)}, df2 rows: {len(df2)}, merged: {len(combined)}")
# If merged << df1, keys don't match — inspect both
print(df1["specimen_id"].unique()[:5])
print(df2["specimen_id"].unique()[:5])
```
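One more diagnostic worth knowing: merge's `indicator=True` flag labels each row with where its key came from, which makes key mismatches visible immediately. A minimal sketch on made-up IDs:

```python
import pandas as pd

sensor = pd.DataFrame({"specimen_id": [1, 2, 3], "strain": [0.1, 0.2, 0.3]})
meta   = pd.DataFrame({"specimen_id": [2, 3, 4], "material": ["Al", "Ti", "Steel"]})

# Outer merge keeps everything; the _merge column says which side each row matched
merged = pd.merge(sensor, meta, on="specimen_id", how="outer", indicator=True)
counts = merged["_merge"].value_counts()
print(counts["both"], counts["left_only"], counts["right_only"])  # → 2 1 1
```

`left_only` rows are sensor readings with no metadata; `right_only` rows are metadata for specimens you have no readings for. Both usually point at a typo or formatting difference in the key column.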
```python
import pandas as pd

# gps_df: 1 Hz   — timestamp, lat, lon, altitude
# imu_df: 100 Hz — timestamp, accel_x, accel_y, accel_z

# Both must be sorted by timestamp first
gps_df = gps_df.sort_values("timestamp")
imu_df = imu_df.sort_values("timestamp")

# Merge: for each IMU row (the left frame), find the nearest GPS reading
combined = pd.merge_asof(
    imu_df, gps_df,
    on="timestamp",
    direction="nearest",             # or "backward", "forward"
    tolerance=pd.Timedelta("500ms")  # don't match if gap > 500ms
)

# Resample both to a common frequency first (cleaner approach)
gps_10hz = gps_df.set_index("timestamp").resample("100ms").interpolate()
imu_10hz = imu_df.set_index("timestamp").resample("100ms").mean()
combined = gps_10hz.join(imu_10hz, how="inner")
```
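A toy merge_asof on plain numeric timestamps makes the matching rule concrete (made-up numbers; real data would use datetimes as above):

```python
import pandas as pd

# "Fast" stream at t = 0..4, "slow" stream at t = 0 and 3
fast = pd.DataFrame({"t": [0, 1, 2, 3, 4], "accel": [0.1, 0.2, 0.3, 0.4, 0.5]})
slow = pd.DataFrame({"t": [0, 3], "alt": [1000, 1030]})

# backward: for each fast row, take the most recent slow row at or before it
out = pd.merge_asof(fast, slow, on="t", direction="backward")
print(out["alt"].tolist())  # → [1000, 1000, 1000, 1030, 1030]

# nearest: pick whichever slow row is closest in time
out2 = pd.merge_asof(fast, slow, on="t", direction="nearest")
print(out2["alt"].tolist())  # → [1000, 1000, 1030, 1030, 1030]
```

The left frame should always be the higher-rate stream: merge_asof keeps every left row and attaches at most one right row to each.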
```python
import pandas as pd
import glob, os

# Load every CSV in the folder and stack them
files = glob.glob("test_runs/*.csv")
dfs = []
for path in files:
    df = pd.read_csv(path)
    df["source_file"] = os.path.basename(path)  # track origin
    dfs.append(df)

# Concat — ignore_index resets row numbers
all_data = pd.concat(dfs, ignore_index=True)
print(f"Total rows: {len(all_data):,} from {len(files)} files")

# Concat with hierarchical index — keys labels each source (keep file identity)
all_keyed = pd.concat(dfs, keys=[os.path.basename(f) for f in files])
# Access one file's data: all_keyed.loc["run_003.csv"]

# Verify: check for column mismatches across files
col_sets = [set(df.columns) for df in dfs]
if len(set.union(*col_sets)) != len(set.intersection(*col_sets)):
    print("WARNING: column mismatch across files")
    for s in col_sets:
        print(s)
```
Automating Reports
The last mile of any analysis is communication. Generating formatted reports programmatically (PDF, Word, or HTML) means the report updates automatically when the data changes. No manual copy-paste, no formatting drift.
```python
import pandas as pd
import matplotlib.pyplot as plt
import base64
from io import BytesIO
from datetime import datetime

def fig_to_b64(fig):
    """Convert matplotlib figure to base64 string for embedding in HTML."""
    buf = BytesIO()
    fig.savefig(buf, format="png", dpi=150, bbox_inches="tight")
    buf.seek(0)
    return base64.b64encode(buf.read()).decode()

# Load and process data
df = pd.read_csv("test_results.csv")
summary = df.describe().round(3)

# Create a plot
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df["time"], df["stress_MPa"], lw=1.5)
ax.set_xlabel("Time (s)"); ax.set_ylabel("Stress (MPa)")
ax.grid(True, alpha=0.3)
img_b64 = fig_to_b64(fig)
plt.close(fig)

# Build HTML
html = f"""<!DOCTYPE html><html><head>
<style>body{{font-family:sans-serif;max-width:960px;margin:40px auto;}}
table{{border-collapse:collapse;width:100%;}}
th,td{{border:1px solid #ddd;padding:8px;text-align:right;}}
th{{background:#f5f5f5;}}
</style></head><body>
<h1>Test Report</h1>
<p>Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
<h2>Summary Statistics</h2>
{summary.to_html()}
<h2>Stress vs Time</h2>
<img src="data:image/png;base64,{img_b64}" style="width:100%">
</body></html>"""

with open("report.html", "w") as f:
    f.write(html)
print("Saved report.html")
```
```python
from docx import Document
from docx.shared import Inches, Pt
from datetime import datetime
import pandas as pd

doc = Document()

# Title and metadata
doc.add_heading("Fatigue Test Report — Al 2024-T3", level=0)
doc.add_paragraph(f"Author: Noor Keshaish\nDate: {datetime.now().strftime('%d %b %Y')}")

# Section heading
doc.add_heading("Summary Statistics", level=1)

# Table from DataFrame
df = pd.read_csv("results.csv")
summary = df.describe().reset_index()
table = doc.add_table(rows=1, cols=len(summary.columns))
table.style = "Table Grid"

# Header row
for i, col in enumerate(summary.columns):
    table.rows[0].cells[i].text = str(col)

# Data rows
for _, row in summary.iterrows():
    cells = table.add_row().cells
    for i, val in enumerate(row):
        cells[i].text = str(round(val, 3) if isinstance(val, float) else val)

# Embed a figure
doc.add_heading("Stress-Strain Curve", level=1)
doc.add_picture("stress_strain.png", width=Inches(5.5))

doc.save("fatigue_report.docx")
print("Saved fatigue_report.docx")
```
```python
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from datetime import datetime
import pandas as pd

df = pd.read_csv("test_results.csv")

with PdfPages("test_report.pdf") as pdf:
    # Page 1: title page as text
    fig = plt.figure(figsize=(11, 8.5))
    fig.text(0.5, 0.6, "Test Analysis Report", ha="center", size=28)
    fig.text(0.5, 0.5, f"Generated: {datetime.now():%d %b %Y}", ha="center", size=14)
    pdf.savefig(fig); plt.close(fig)

    # Page 2: stress-strain plot
    fig, ax = plt.subplots(figsize=(11, 8.5))
    ax.plot(df["strain"], df["stress"], lw=2)
    ax.set_xlabel("Strain (%)"); ax.set_ylabel("Stress (MPa)")
    ax.set_title("Stress-Strain Curve"); ax.grid(True, alpha=0.3)
    pdf.savefig(fig); plt.close(fig)

    # Page 3: summary table rendered as a figure
    fig, ax = plt.subplots(figsize=(11, 4))
    ax.axis("off")
    summary = df.describe().round(2)
    tbl = ax.table(cellText=summary.values, rowLabels=summary.index,
                   colLabels=summary.columns, loc="center", cellLoc="right")
    tbl.auto_set_font_size(False); tbl.set_fontsize(9)
    pdf.savefig(fig, bbox_inches="tight"); plt.close(fig)

print("Saved test_report.pdf (3 pages)")
```
Plotly: Interactive Plots
Matplotlib produces static images. Plotly produces interactive HTML: hover for values, zoom, pan, toggle traces on/off, export to PNG. One function call to go from data to an interactive browser chart.
```python
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd

df = pd.read_csv("flight_data.csv")

# Line plot — interactive, hover shows values
fig = px.line(df, x="time", y="altitude_ft", title="Altitude vs Time",
              labels={"altitude_ft": "Altitude (ft)", "time": "Time (s)"})
fig.show()  # opens in browser

# Multiple traces on one plot, with a secondary y-axis
fig = go.Figure()
fig.add_trace(go.Scatter(x=df["time"], y=df["altitude_ft"], name="Altitude", yaxis="y1"))
fig.add_trace(go.Scatter(x=df["time"], y=df["airspeed_kts"], name="Airspeed", yaxis="y2"))
fig.update_layout(
    yaxis=dict(title="Altitude (ft)"),
    yaxis2=dict(title="Airspeed (kts)", overlaying="y", side="right")
)
fig.show()

# Scatter with colour mapped to a third variable
fig = px.scatter(df, x="airspeed_kts", y="load_factor", color="altitude_ft",
                 size="fuel_flow", hover_data=["time", "phase"],
                 title="Load Factor vs Airspeed")
fig.show()
```
```python
# Save as standalone HTML — send to anyone, no Python needed
fig.write_html("flight_analysis.html")

# Save as static image (requires kaleido: pip install kaleido)
fig.write_image("plot.png", width=1200, height=600, scale=2)
fig.write_image("plot.pdf")  # vector PDF for reports

# Subplots
from plotly.subplots import make_subplots

fig = make_subplots(rows=3, cols=1, shared_xaxes=True,
                    subplot_titles=["Altitude", "Airspeed", "Load Factor"])
fig.add_trace(go.Scatter(x=df["time"], y=df["altitude_ft"], name="Alt"), row=1, col=1)
fig.add_trace(go.Scatter(x=df["time"], y=df["airspeed_kts"], name="IAS"), row=2, col=1)
fig.add_trace(go.Scatter(x=df["time"], y=df["load_factor"], name="Nz"), row=3, col=1)
fig.update_layout(height=700, title="Flight Data Overview")
fig.show()
```
Use matplotlib for: publication figures, embedded plots in reports, precise layout control, LaTeX labels.
Use Plotly for: exploratory analysis where you want to inspect values, dashboards, anything you'll share as HTML, multi-axis time series with large datasets.
Seaborn: Statistical Plots
Seaborn wraps matplotlib with statistical plot types that would take 30 lines to build manually: distribution plots, correlation heatmaps, pair plots, regression overlays. Built for data that has categories and distributions.
```python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("material_tests.csv")

# Histogram + KDE (kernel density estimate)
sns.histplot(df["UTS_MPa"], kde=True, bins=30)
plt.xlabel("UTS (MPa)"); plt.show()

# Compare distributions across groups
sns.histplot(data=df, x="UTS_MPa", hue="material", kde=True)
plt.show()

# Box plot — show median, IQR, and outliers
sns.boxplot(data=df, x="material", y="UTS_MPa")
plt.show()

# Violin plot — box + full distribution shape
sns.violinplot(data=df, x="material", y="UTS_MPa", inner="box")
plt.show()

# Strip + box combined (show individual points)
fig, ax = plt.subplots(figsize=(10, 5))
sns.boxplot(data=df, x="material", y="UTS_MPa", ax=ax, fliersize=0)
sns.stripplot(data=df, x="material", y="UTS_MPa", ax=ax,
              alpha=0.4, jitter=True, color="steelblue")
plt.show()
```
```python
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Correlation matrix of all numeric columns
corr = df.select_dtypes("number").corr()

fig, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(corr,
            annot=True,       # show correlation values
            fmt=".2f",
            cmap="coolwarm",  # red=positive, blue=negative
            center=0,
            square=True, linewidths=0.5, ax=ax)
ax.set_title("Parameter Correlation Matrix")
plt.tight_layout(); plt.show()

# Mask the upper triangle (remove redundancy)
mask = np.triu(np.ones_like(corr, dtype=bool))
sns.heatmap(corr, mask=mask, annot=True, fmt=".2f", cmap="coolwarm", center=0)
plt.show()
```
```python
# Pair plot: every variable plotted against every other
# Diagonal shows each variable's distribution
sns.pairplot(df[["UTS_MPa", "E_GPa", "elongation", "hardness"]], diag_kind="kde")
plt.show()

# Pair plot coloured by category
sns.pairplot(df, hue="material", vars=["UTS_MPa", "E_GPa", "elongation"])
plt.show()

# Scatter with regression line + confidence interval
sns.regplot(data=df, x="E_GPa", y="UTS_MPa",
            scatter_kws={"alpha": 0.5}, line_kws={"color": "red"})
plt.show()

# lmplot — regression per category
sns.lmplot(data=df, x="E_GPa", y="UTS_MPa", hue="material",
           height=5, aspect=1.5)
plt.show()
```
Dash & Streamlit — Data Apps
A dashboard turns your analysis script into something non-coders can use. Upload a CSV, change a parameter, click a button, and the plots update. Streamlit is the fastest path to a working app. Dash gives you more control for production tools.
Install with pip install streamlit, then run streamlit run app.py. A live browser tab opens and reloads every time you save the file.

```python
import streamlit as st
import pandas as pd
import plotly.express as px

st.title("Flight Data Analyser")

# File uploader — drag and drop CSV
uploaded = st.file_uploader("Upload a CSV file", type=["csv"])

if uploaded:
    df = pd.read_csv(uploaded)
    st.write(f"Loaded {len(df):,} rows")

    # Sidebar controls
    cols = df.select_dtypes("number").columns.tolist()
    x_col = st.sidebar.selectbox("X axis", cols)
    y_col = st.sidebar.selectbox("Y axis", cols, index=1)
    plot_type = st.sidebar.radio("Plot type", ["Line", "Scatter"])

    # Filter by altitude range
    if "altitude_ft" in df.columns:
        min_alt, max_alt = int(df["altitude_ft"].min()), int(df["altitude_ft"].max())
        alt_range = st.slider("Altitude range (ft)", min_alt, max_alt, (min_alt, max_alt))
        df = df[df["altitude_ft"].between(*alt_range)]

    # Plot
    if plot_type == "Line":
        fig = px.line(df, x=x_col, y=y_col)
    else:
        fig = px.scatter(df, x=x_col, y=y_col, opacity=0.5)
    st.plotly_chart(fig, use_container_width=True)

    # Statistics
    st.subheader("Summary statistics")
    st.dataframe(df[[x_col, y_col]].describe().round(3))
```
```shell
# 1. Create requirements.txt containing:
#      streamlit
#      pandas
#      plotly
#      numpy

# 2. Push to GitHub (public or private repo)

# 3. Go to share.streamlit.io
#    Connect GitHub → select repo → select app.py → Deploy
#    Free tier gives you a public URL: yourapp.streamlit.app

# Run locally
streamlit run app.py
```
```python
from dash import Dash, dcc, html, Input, Output
import plotly.express as px
import pandas as pd

df = pd.read_csv("flight_data.csv")
app = Dash(__name__)

app.layout = html.Div([
    html.H1("Flight Data Dashboard"),
    dcc.Dropdown(
        id="y-param",
        options=[{"label": c, "value": c} for c in df.select_dtypes("number").columns],
        value="altitude_ft"
    ),
    dcc.Graph(id="main-chart")
])

@app.callback(
    Output("main-chart", "figure"),
    Input("y-param", "value")
)
def update_chart(y_col):
    return px.line(df, x="time", y=y_col, title=y_col)

if __name__ == "__main__":
    app.run(debug=True)  # visit http://localhost:8050
```
Streamlit: you need something working today, the UI can be simple, and the audience is internal.
Dash: you need precise layout control, complex interactivity, or the tool is going to external users.
Python & C++ Cheat Sheet
Every essential concept from the course, side by side. Bookmark this page. Come back to it whenever you're coding and can't remember the exact syntax.
| Concept | Python | C++ | Notes |
|---|---|---|---|
| PRINTING | |||
| Print text | print("Hello") | cout << "Hello" << endl; | endl = new line |
| Print variable | print(x) | cout << x << endl; | |
| Print + variable | print(f"Hi {name}") | cout << "Hi " << name; | f-string vs chain |
| VARIABLES | |||
| Integer | x = 10 | int x = 10; | Whole numbers |
| Decimal | x = 3.14 | double x = 3.14; | Decimal numbers |
| Text | x = "hello" | string x = "hello"; | Always in quotes |
| True/False | x = True | bool x = true; | Lowercase in C++ |
| MATHS | |||
| Add | a + b | a + b | Same in both |
| Subtract | a - b | a - b | Same in both |
| Multiply | a * b | a * b | Same in both |
| Divide | a / b | a / b | C++: use double for decimals |
| Remainder | a % b | a % b | Same in both |
| Power | a ** b | pow(a, b) | Need <cmath> in C++ |
| Add to self | x += 5 | x += 5; | Same in both |
| USER INPUT | |||
| Read text | x = input("msg") | cin >> x; | |
| Read number | x = int(input("msg")) | int x; cin >> x; | Python: must convert |
| IF / ELSE | |||
| If | if x > 5: | if (x > 5) { | Indentation vs braces |
| Else if | elif x == 5: | } else if (x == 5) { | elif vs else if |
| Else | else: | } else { | |
| Equals | x == y | x == y | == not = (= assigns) |
| Not equals | x != y | x != y | Same in both |
| And | x > 0 and x < 10 | x > 0 && x < 10 | and vs && |
| Or | x < 0 or x > 10 | x < 0 || x > 10 | or vs || |
| LOOPS | |||
| For (range) | for i in range(5): | for(int i=0; i<5; i++){ | 0 to 4 |
| For (range 1-5) | for i in range(1,6): | for(int i=1; i<=5; i++){ | 1 to 5 |
| For (list) | for x in myList: | for(type x : myArray){ | For-each |
| While | while x < 10: | while (x < 10) { | Same logic |
| Increment | i += 1 | i++ or i += 1 | |
| FUNCTIONS | |||
| Define (no return) | def greet(): | void greet() { | void = no return |
| Define (with param) | def greet(name): | void greet(string name) { | C++: must type params |
| Return a value | return x * 2 | return x * 2; | Replace void with type |
| Call a function | greet("Alice") | greet("Alice"); | Same syntax |
| LISTS / ARRAYS | |||
| Create | x = [1, 2, 3] | int x[] = {1, 2, 3}; | |
| Access item | x[0] | x[0] | Index starts at 0 |
| Change item | x[0] = 99 | x[0] = 99; | Same |
| Length | len(x) | sizeof(x)/sizeof(x[0]) | No built-in for raw C++ arrays; vector has .size() |
| Add item | x.append(4) | Use vector<int> | C++ arrays are fixed size |
| MATRICES | |||
| Create 3x3 | m = [[1,2,3],[4,5,6],[7,8,9]] | int m[3][3] = {{1,2,3},{4,5,6},{7,8,9}}; | |
| Access cell | m[row][col] | m[row][col] | Same in both |
| FILE SETUP (C++ only) | |||
| For print/input | (not needed) | #include <iostream> | Always at top |
| For strings | (not needed) | #include <string> | Needed for string type |
| Namespace | (not needed) | using namespace std; | Avoids writing std:: |
| Main function | (not needed) | int main() { ... return 0; } | Every C++ program needs this |
1. In Python: indentation matters. 4 spaces = belongs to this block.
2. In C++: every statement ends with a semicolon ;
3. In C++: every block of code goes inside { } braces.
4. Both languages: == means "compare", = means "assign". Don't mix them up.
5. Both languages: arrays start at index 0, not 1.
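The rules above in action, sketched in Python (the C++ equivalents follow the same logic, just with braces and semicolons):

```python
# Rule 4: == compares, = assigns
x = 5          # assign
if x == 5:     # compare — writing "if x = 5:" is a syntax error in Python
    print("x is five")

# Rule 5: indexing starts at 0
readings = [10, 20, 30]
print(readings[0])   # → 10 (first item)
print(readings[2])   # → 30 (last item, index len-1)

# Rule 1: indentation defines the block
for r in readings:
    print(r)         # inside the loop (indented)
print("done")        # outside the loop (back at the left margin)
```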