shecodeslab.com

SheCodes
Lab.

Python and C++, written side by side. Built by an aerospace engineer for engineers and developers who need to actually use the language.

🌱
Never coded before
Start with what programming actually is, then write your first line.
Know some basics
Jump to Lesson 1: Python and C++ side by side from Hello World.
🛠️
I'm an engineer
NumPy, pandas, matplotlib, ISA models, cost index, FDR analysis.
📋
Just the reference
Cheat sheets for NumPy, pandas, matplotlib, C++, and common errors.
45+
lessons
8
modules
2
languages
Built by Noor Keshaish
Aerospace & Aeronautical Engineer
Before You Code · 01

What is Programming?

Before writing any code, you need one mental model. This page gives you that model.

SECTION B1.1
A computer is just a very obedient machine
A computer will do exactly what you tell it. Not what you mean. Not what makes sense. Exactly, literally, precisely what you write. This is the most important thing to understand about programming.
🤖
Imagine a robot with no common sense. You tell it "get me a glass of water." It stares at you. It doesn't know what "get", "me", "glass", or "water" means. But if you say "walk to kitchen, open cabinet, take glass, open tap, hold glass under tap, close tap, walk back," it does it perfectly, every time. Programming is writing those exact instructions.
A program is nothing more than a list of instructions, written in a language the computer understands. The computer reads them one by one, from top to bottom, and follows them precisely. No guessing. No shortcuts. No common sense.
SECTION B1.2
How a program actually runs
Here's what happens when you write and run code. Every programmer goes through exactly these steps:
✍️
Step 1
You write code
In a text editor or IDE on your computer
🔍
Step 2
It's checked
Python/C++ reads your code and checks for mistakes
⚙️
Step 3
It runs
The computer executes each instruction, top to bottom
📺
Step 4
You see output
Results appear in the terminal or a window
When something goes wrong, the computer doesn't "almost" run your code. It either runs the instructions as written, or it stops and gives you an error message. An error message is the computer telling you precisely what it couldn't process. Reading them accurately is a core skill.
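To make "reading an error message" concrete, here is a minimal Python sketch. The helper name describe_error is made up for this example; it runs a piece of code and returns the error's type and message as text instead of letting the program stop.

```python
def describe_error(action):
    """Run action() and return the error type and message as text."""
    try:
        action()
    except Exception as exc:  # catch the error instead of crashing
        return f"{type(exc).__name__}: {exc}"
    return "no error"


def divide_by_zero():
    return 10 / 0


def add_numbers():
    return 10 + 5


message = describe_error(divide_by_zero)
print(message)                      # starts with ZeroDivisionError
print(describe_error(add_numbers))  # no error
```

The first word of the message is the error type. That is usually the most useful part to read first.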
SECTION B1.3
What is a programming language?
Computers only understand one thing: electrical signals, which are ultimately just 0s and 1s. Programming languages are a human-readable way to give instructions that get translated into those 0s and 1s. You write in Python or C++. The computer gets binary. You never see the binary.
PYTHON
Reads almost like English. Designed to be easy to learn and easy to read. The computer figures out a lot of details for you. Great for beginners, data science, automation, web development.
print("Hello!")
C++
More explicit and more powerful. You tell the computer everything. Faster to execute. Used in games, aerospace systems, embedded hardware, high-performance applications.
cout << "Hello!";
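As a rough illustration of the "0s and 1s" idea, the Python sketch below shows the bit patterns behind the text Hello!. It assumes the common ASCII/UTF-8 character codes, and the variable names are chosen only for this example. Real machine instructions are more involved, but the principle that everything ends up as bits is the same.

```python
text = "Hello!"

# Each character has a numeric code; format(..., "08b") shows it as 8 bits.
bits = [format(ord(ch), "08b") for ch in text]

for ch, pattern in zip(text, bits):
    print(ch, pattern)
# H 01001000
# e 01100101
# l 01101100
# l 01101100
# o 01101111
# ! 00100001
```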
SECTION B1.4
What can you build with code?
What you want to build | Language to use
Websites and web apps | Python (backend), JavaScript (frontend)
Data analysis, graphs, reports | Python (pandas, matplotlib)
Games | C++ (Unreal Engine), C# (Unity)
Engineering simulations, FEA | C++, Python (NumPy/SciPy)
Mobile apps (iPhone/Android) | Swift (iPhone), Kotlin (Android)
Machine learning / AI | Python (TensorFlow, PyTorch)
Embedded systems / microcontrollers | C++, C
Automate repetitive computer tasks | Python
Before You Code · 02

Setting Up Your Computer

Two things to install: Python and an editor. Follow these steps.

SECTION B2.1
Install Python
Python is the program that reads and runs your code. You only install it once.
🍎
Mac
1. Go to python.org/downloads
2. Click the yellow "Download Python" button
3. Open the downloaded file and follow the installer
4. Done.
🪟
Windows
1. Go to python.org/downloads
2. Download the installer
3. Open it and tick "Add Python to PATH" before clicking Install
4. Done.
🐧
Linux
Python is usually pre-installed. Check:
python3 --version

If not: sudo apt install python3
Check it worked. Open your terminal (on Mac: search "Terminal"; on Windows: search "Command Prompt") and type: python --version then press Enter. On Mac or Linux, use python3 --version if python is not found. You should see something like Python 3.12.0. If you do, Python is installed correctly.
SECTION B2.2
What is a terminal?
A terminal (also called command prompt or shell) is a text-based window where you type commands directly to your computer. You only need a handful of commands.
Terminal commands
# Find out where you are (your current folder)
pwd              # Mac/Linux
cd               # Windows (just typing cd shows your location)

# List files in the current folder
ls               # Mac/Linux
dir              # Windows

# Move into a folder
cd Documents
cd Documents/my_code

# Go back up one folder
cd ..

# Run a Python file
python hello.py
python3 hello.py  # if the above doesn't work

# Install a Python library
pip install numpy
pip install pandas matplotlib scipy
SECTION B2.3
Pick an editor
A code editor is like Word, but for code. It colours your text to make it readable, catches obvious mistakes, and lets you run your program. There are many options. Pick one and stick with it.
Editor | Best for | Download
VS Code | The most widely used editor. Works for Python, C++, and everything else. | code.visualstudio.com
PyCharm Community | Python-specific. More features for Python, slightly heavier. | jetbrains.com/pycharm
Jupyter Notebook | Data analysis and engineering. Run code block by block, see results inline. | pip install notebook
Thonny | Total beginners. Simple interface, shows you exactly what each line does. | thonny.org
Recommendation: install VS Code. It works for Python, C++, and everything else you'll ever learn. Install the "Python" extension from the Extensions panel on the left sidebar. Together they give you syntax highlighting, error detection, and a built-in terminal.
SECTION B2.4
Your first Python file
The complete workflow from a blank file to a running program:
Your first program
# 1. Open VS Code
# 2. File → New File → save it as:  hello.py
#    (the .py tells the computer it's Python)

# 3. Type this into the file:
print("Hello! I am learning to code.")
print("This is my first program.")

# 4. Open the terminal in VS Code: View → Terminal
# 5. Type:  python hello.py  and press Enter
# 6. You should see:
#    Hello! I am learning to code.
#    This is my first program.

# Congratulations — you just ran your first program.
Before You Code · 03

How Computers Think

You don't need to know electronics. But understanding how a computer works changes how you read error messages, write loops, and think about memory.

SECTION B3.1
Memory: where your program lives while it runs
When you run a program, the computer loads it into RAM (memory). Every variable you create is stored somewhere in that memory. When the program finishes, everything is gone, like closing a document without saving.
📋
Think of memory like a whiteboard. While you're working, you write things on it: calculations, names, results. When you're done and erase it, everything's gone. If you want to keep something permanently, you save it to a file (the hard drive). That's like writing in a notebook.
What happens in memory when you write: age = 22
Memory slot | Contents
0x4A2F | ...
age | 22
0x4A31 | ...
0x4A32 | ...
Python finds a free slot in memory, stores the value 22 there, and labels that slot age. Every time you write age, Python looks up that slot and gives you what's inside.
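The labelling idea can be checked in Python itself. This is a small sketch, not a picture of the real memory layout: the variable names are illustrative, and the number that id() returns differs on every run.

```python
age = 22
same_age = age          # a second label for the same stored value

print(age)                      # 22
print(same_age is age)          # True: both names refer to the same object
print(id(age) == id(same_age))  # True

age = 23                # the label 'age' now points at a different value
print(age)              # 23
print(same_age)         # 22: the old value is still reachable via same_age
```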
SECTION B3.2
How a loop works
Tracing: for i in range(4): print(i)
Round | i | Check | Output
Round 1 | i = 0 | 0 < 4 → run body | prints: 0
Round 2 | i = 1 | 1 < 4 → run body | prints: 1
Round 3 | i = 2 | 2 < 4 → run body | prints: 2
Round 4 | i = 3 | 3 < 4 → run body | prints: 3
When i = 4:   4 < 4 is False → loop ends.
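The trace above can be reproduced with an explicit counter. This sketch rewrites the for loop as a while loop so the check i < 4 is visible. Python's range() handles this bookkeeping internally, so treat it as a model of the behaviour, not what Python literally does. The list named printed is added only so the result can be inspected.

```python
i = 0
printed = []

while i < 4:           # the check made at the start of every round
    print(i)           # the loop body
    printed.append(i)
    i += 1             # move on to the next value

print(printed)         # [0, 1, 2, 3]
print(i)               # 4: the value that made the check False
```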
SECTION B3.3
Compiled vs Interpreted
Python: Interpreted
Your code → Python reads it live
Line 1 → executed immediately
Line 2 → executed immediately
Line 3 → executed immediately
Slower to run, but easier to write and test. Most errors show up only when that specific line runs.
C++: Compiled
Your code → the compiler reads ALL of it
Compile → checks everything, builds an executable (.exe on Windows)
Run → executes machine code directly
Fast → often 10–100× faster than Python for compute-heavy code
More setup, but many errors are caught before running. Produces faster programs.
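Here is a short Python sketch of an error that appears only when its line runs. The function name broken_step and the misspelled variable totl are invented for the example. This applies to mistakes such as a misspelled name; a syntax error is reported before any line runs.

```python
steps = []


def broken_step():
    return totl + 1    # 'totl' was never defined: a NameError waiting to happen


steps.append("line 1 ran")
steps.append("line 2 ran")

try:
    broken_step()      # the mistake only surfaces when this line executes
except NameError as exc:
    steps.append(f"error: {type(exc).__name__}")

print(steps)
# ['line 1 ran', 'line 2 ran', 'error: NameError']
```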
You're ready to start coding. You now know what a program is, what happens when it runs, how memory works, and how loops work. The rest is just syntax: the exact words and symbols each language uses. Everything in the next 16 lessons builds on exactly what you just learned.
Reference

Glossary: Every Term Explained

Every technical term used on this site. Searchable, grouped by topic.

Term | Plain English explanation
Algorithm | A step-by-step method for solving a problem. A recipe is an algorithm. A sorting method is an algorithm.
Argument | A value you pass into a function when calling it. In print("hello"), the argument is "hello".
Array | An ordered collection of values stored together, accessed by number index. Index starts at 0.
Boolean | A value that is either True or False. Used in conditions: if x > 5 evaluates to True or False.
Bug | A mistake in code that causes it to behave incorrectly. The term was popularized by a real moth found in a computer in 1947.
Class | A blueprint for creating objects. Defines what data they hold and what they can do.
Compile | Translate your entire source code into machine code before running. C++ is compiled. Python is interpreted.
Condition | An expression that evaluates to True or False, used to make decisions. age >= 18 is a condition.
Crash | When a program stops unexpectedly due to an error it couldn't handle.
Data type | The kind of value a variable holds: integer, float, string, boolean. Determines what you can do with it.
Debug | Finding and fixing errors (bugs) in code. One of the most important and time-consuming parts of programming.
Dictionary | A collection of key-value pairs. Like a real dictionary: look up a word (key) to get its definition (value).
Exception | An error that occurs while the program is running. Can be caught with try/except (Python) or try/catch (C++).
Float | A number with a decimal point. 3.14, -0.5, 1.0 are all floats.
Function | A named block of code that does one job. You define it once and call it by name whenever needed.
IDE | Integrated Development Environment: a code editor with extra tools like error highlighting and a debugger. Examples: VS Code, PyCharm.
Index | The position number of an item in a list or array. Starts at 0, not 1. list[0] is the first item.
Integer | A whole number with no decimal point. 1, -5, 1000 are integers.
Interpret | Read and execute code one line at a time as the program runs. Python is interpreted. Slower but easier to use.
Library | A collection of pre-written code you can import and use. import numpy loads thousands of maths functions someone else wrote.
Loop | A block of code that repeats. for loops repeat a set number of times. while loops repeat until a condition is false.
Method | A function that belongs to an object or class. Called with a dot: text.upper() calls the upper method on a string.
NaN | "Not a Number": a special floating-point value that appears when maths produces an undefined result. Poisons downstream calculations.
None / null | A special value meaning "nothing" or "no value". Python uses None, C++ uses nullptr (for pointers) or 0.
Object | An instance of a class. A specific thing created from a blueprint. In alice = Person("Alice", 22), alice is an object.
Parameter | A variable in a function definition that receives a value when the function is called. In def greet(name):, name is a parameter.
Pointer | C++ (not used in Python). A variable that stores the memory address of another variable. Powerful but dangerous if misused.
Print | Display output to the terminal/console. The first thing every programmer learns. print() in Python, cout in C++.
RAM | Random Access Memory: fast, temporary storage where your program and all its variables live while running. Cleared when the program ends.
Return | Send a value back from a function. return x * 2 gives the result back to wherever the function was called.
Runtime error | An error that happens while the program is running, not when compiling. Examples: division by zero, accessing a missing index.
Scope | Where a variable is visible. Local scope = inside one function. Global scope = everywhere in the file.
Semicolon | The ; character required at the end of every statement in C++. Forgetting it is the most common C++ beginner error.
String | Text data: any sequence of characters wrapped in quotes. "Hello", "42", "Noor" are all strings.
Syntax | The grammar rules of a programming language. Every language has its own syntax: the exact symbols and words it understands.
Syntax error | You wrote something the language doesn't understand. Often a missing colon, bracket, or quote. Must be fixed before the program can run.
Terminal | A text-based window where you type commands directly to your computer. Also called: command prompt, shell, console.
Variable | A named container in memory that holds a value. age = 22 creates a variable called age holding the value 22.
Vector | In C++: a dynamic array that can grow in size. vector<int> v;. In maths: a quantity with magnitude and direction.
void | C++ only. Means "this function returns nothing." Used when a function just does something rather than calculating a result.
Lesson 01 · Module 1

Hello, World

Almost every programmer starts with this program. It does one thing: print "Hello, World!" on the screen. It sounds trivial, but it teaches you the most important skill in coding: getting your first program to actually run.

SECTION 1.1
What is a program?
A program is a list of instructions you give to a computer. The computer follows them one by one, from top to bottom, exactly as written. No common sense. Just precision. If you write something the computer doesn't understand, it stops and gives you an error.
💡
Think of it like a recipe. You write every step in order. The computer is the kitchen: it follows the recipe exactly, never skipping a step, never guessing what you meant. Your job as a programmer is to write clear, unambiguous instructions.
SECTION 1.2
Your first program in both languages
Below is the same program written in Python and C++. They do exactly the same thing. Look at how different they are. We'll explain every piece.
Python
hello.py
print("Hello, World!")
C++
hello.cpp
#include <iostream>
using namespace std;

int main() {
    cout << "Hello, World!" << endl;
    return 0;
}
Python, line by line:
print() is a built-in function. Whatever you put inside the parentheses, in quotes, gets printed to the screen. That's the entire program. One line.

C++, line by line:
#include <iostream>: loads the input/output tools. Without it, cout is not available.
using namespace std;: lets you write cout instead of std::cout every time.
int main() { }: every C++ program must have a main() function. This is where your code starts running.
cout << "Hello, World!" << endl;: cout sends text to the screen. endl moves to the next line.
return 0;: tells the operating system "program finished successfully."
Why does C++ need so many lines? Because C++ gives you more control and speed. The setup is more explicit. Python hides a lot of this for you to keep things simple. They're designed for different situations.
SECTION 1.3
Print multiple lines
You can print as many lines as you want. Just call the print command again on a new line.
Python
print("Hello, World!")
print("I am learning to code.")
print("This is my first program.")
C++
#include <iostream>
using namespace std;

int main() {
    cout << "Hello, World!" << endl;
    cout << "I am learning to code." << endl;
    cout << "This is my first program." << endl;
    return 0;
}
Note: endl in C++ means end of line. It moves the cursor to the next line. Python's print() does this automatically after every call.
Quick Check · Lesson 1
In Python, which command prints text to the screen?
A cout <<
B print()
C display()
D echo()
Lesson 02 · Module 1

Variables & Data Types

A variable is a named location in memory. Data types define what kind of value it holds and what operations are valid on it.

SECTION 2.1
What is a variable?
📦
Think of a labelled box. You write "age" on the outside and put the number 22 inside. Later, whenever you need that number, you just say "age" and the computer opens the box and gives you what's inside.
In Python you create a variable by writing: name = value. In C++ you must first declare what type of value the variable holds before you store anything.
Python
# No type needed — Python figures it out
name      = "Alice"
age       = 22
height    = 1.68
is_coder  = True

print(name)
print(age)
print(height)
print(is_coder)
C++
// Must declare the type first
string  name     = "Alice";
int     age      = 22;
double  height   = 1.68;
bool    isCoder  = true;

cout << name    << endl;
cout << age     << endl;
cout << height  << endl;
cout << isCoder << endl;  // prints 1 (C++ shows true as 1 by default)
SECTION 2.2
The 4 essential data types
These four types cover almost everything you'll need as a beginner. Nearly every simple value in your early programs is one of these.
Type | What it stores | Python | C++
Integer | Whole numbers, no decimal point | age = 22 | int age = 22;
Float / Double | Numbers with a decimal point | height = 1.68 | double height = 1.68;
String | Text, always in quotes | name = "Alice" | string name = "Alice";
Boolean | True or False, two values only | done = True | bool done = true;
Python vs C++ typing: Python is dynamically typed: you don't declare types, Python figures it out when the code runs. C++ is statically typed: you must declare the type upfront, and the compiler checks it before the program even runs. C++ catches type errors before they become bugs at runtime.
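A quick way to watch dynamic typing at work is Python's built-in type(). In this sketch the names value, first_type, second_type, and third_type are illustrative. The same variable name holds three different kinds of value, one after another.

```python
value = 22
first_type = type(value).__name__    # 'int'

value = "twenty-two"                 # same name, now holding text
second_type = type(value).__name__   # 'str'

value = 1.68
third_type = type(value).__name__    # 'float'

print(first_type, second_type, third_type)  # int str float
```

In C++ the equivalent reassignment would be rejected by the compiler, because the declared type of a variable cannot change.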
SECTION 2.3
Using variables inside printed text
You can combine variables with regular text when printing. In Python the cleanest way is an f-string: put the variable name inside {} curly braces. In C++ you chain values with <<.
Python
name = "Alice"
age  = 22

# f-string: put variable in {}
print(f"My name is {name}.")
print(f"I am {age} years old.")
print(f"In 5 years I will be {age + 5}.")
C++
string name = "Alice";
int    age  = 22;

// Chain with <<
cout << "My name is " << name << "." << endl;
cout << "I am " << age << " years old." << endl;
cout << "In 5 years: " << age+5 << endl;
SECTION 2.4
Basic arithmetic
Variables storing numbers can be used in mathematical expressions. Most operators are the same in both languages; the exceptions are the power operator and division.
Operator | Meaning | Example | Result
+ | Addition | 10 + 3 | 13
- | Subtraction | 10 - 3 | 7
* | Multiplication | 10 * 3 | 30
/ | Division | 10 / 3 | 3.333...
% | Remainder (modulo) | 10 % 3 | 1
** | Power (Python only) | 2 ** 8 | 256
C++ integer division warning: In C++, dividing two int values gives an int result: 10 / 3 gives 3, not 3.333. For decimal results, use double for at least one operand: 10.0 / 3 = 3.333.
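For comparison, here is how the same divisions behave in Python. The variable names are illustrative. Python's / always gives a decimal result, and the separate // operator gives a whole-number result. For positive numbers, // matches what C++ does with two int values.

```python
a = 10
b = 3

true_division = a / b      # 3.3333333333333335 (always a float)
floor_division = a // b    # 3  (whole-number result)
remainder = a % b          # 1
power = 2 ** 8             # 256

print(true_division)
print(floor_division, remainder, power)   # 3 1 256
```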
Quick Check · Lesson 2
Which data type would you use to store a person's name (e.g. "Noor")?
A int
B bool
C string
D double
Lesson 03 · Module 1

Getting Input from the User

Input lets a program respond to what a user provides at runtime. Here's how both languages handle it.

SECTION 3.1
Reading text from the user
In Python, input() pauses the program, shows a message, and waits. Whatever the user types gets stored in a variable. In C++, cin >> reads what the user typed, up to the first space. To read a whole line including spaces, use getline(cin, name).
Python
name = input("What is your name? ")
print(f"Hello, {name}! Welcome.")
C++
string name;
cout << "What is your name? ";
cin  >> name;
cout << "Hello, " << name << "! Welcome." << endl;
SECTION 3.2
Reading a number from the user
Important: In Python, input() always returns text, even if the user types a number. You must convert it using int() or float(), otherwise you can't do maths with it. C++ handles this automatically if you declare the right type.
Python
# Convert to int so we can do maths
age = int(input("How old are you? "))
print(f"In 10 years you will be {age + 10}.")

# float() for decimal numbers
price = float(input("Price: "))
print(f"With 10% tax: {price * 1.1:.2f}")
C++
// Declare as int — cin reads a number
int age;
cout << "How old are you? ";
cin  >> age;
cout << "In 10 years: " << age+10 << endl;

double price;
cout << "Price: ";
cin  >> price;
cout << "With tax: " << price*1.1 << endl;
Most common beginner mistake in Python: Forgetting to wrap input() with int(). If you do age = input("Age: ") and the user types 25, then age + 10 gives you an error because you can't add a number to text.
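The mistake and its fix can be sketched without a keyboard by using a fixed string in place of input(). The name typed and the helper to_int_or_none are made up for this example. The helper shows one possible way to cope with input that is not a number at all.

```python
typed = "25"   # stands in for what input("Age: ") would return

try:
    result = typed + 10          # text + number
except TypeError as exc:
    result = None
    print(f"TypeError: {exc}")

age = int(typed)                 # convert the text to a whole number first
print(age + 10)                  # 35


def to_int_or_none(raw):
    """Return raw as an int, or None if it is not a whole number."""
    try:
        return int(raw)
    except ValueError:
        return None


print(to_int_or_none("30"))      # 30
print(to_int_or_none("thirty"))  # None
```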
Quick Check · Lesson 3
In Python: age = input("Age: "). The user types 30. What type is age?
A int
B str: input() always returns text
C float
D bool
Lesson 04 · Module 1

Making Decisions: If / Else

Control flow determines which code runs based on conditions. This is how programs branch.

SECTION 4.1
Basic if / else
🚦
Think of a traffic light. If the light is green → go. Else → stop. Your program checks whether a condition is true or false, then takes a different path based on the answer.
Python
age = 18

if age >= 18:
    print("You can vote.")
else:
    print("Too young to vote.")
C++
int age = 18;

if (age >= 18) {
    cout << "You can vote." << endl;
} else {
    cout << "Too young to vote." << endl;
}
Python uses indentation. C++ uses { } braces. In Python, the 4 spaces before print tell Python "this line belongs to the if." In C++, the curly braces do the same job. Miss either and your code breaks.
SECTION 4.2
Multiple conditions: elif and else if
When you have more than two possibilities, chain conditions with elif (Python) or else if (C++). Only the first true condition runs.
Python
score = 75

if score >= 90:
    print("Grade: A")
elif score >= 70:
    print("Grade: B")
elif score >= 50:
    print("Grade: C")
else:
    print("Grade: F")
C++
int score = 75;

if (score >= 90) {
    cout << "Grade: A" << endl;
} else if (score >= 70) {
    cout << "Grade: B" << endl;
} else if (score >= 50) {
    cout << "Grade: C" << endl;
} else {
    cout << "Grade: F" << endl;
}
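To check that only the first true condition runs, the same chain can be wrapped in a small function and tried with several scores. This is a sketch: the function name grade is chosen for illustration, and functions are covered properly in Lesson 6.

```python
def grade(score):
    """Return the letter grade using the same thresholds as the example above."""
    if score >= 90:
        return "A"
    elif score >= 70:
        return "B"
    elif score >= 50:
        return "C"
    else:
        return "F"


for score in [95, 75, 65, 40]:
    print(score, grade(score))
# 95 A
# 75 B
# 65 C
# 40 F
```

A score of 95 also satisfies score >= 70 and score >= 50, but those branches are never reached because the first match ends the chain.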
SECTION 4.3
Comparison operators
Operator | Meaning | True example | False example
== | Equal to | 5 == 5 | 5 == 6
!= | Not equal to | 5 != 6 | 5 != 5
> | Greater than | 6 > 5 | 5 > 6
< | Less than | 5 < 6 | 6 < 5
>= | Greater than or equal | 5 >= 5 | 4 >= 5
<= | Less than or equal | 5 <= 5 | 6 <= 5
Quick Check · Lesson 4
What is elif in Python equivalent to in C++?
A else { }
B else if ( )
C elseif ( )
D switch ( )
Lesson 05 · Module 1

Loops: Repeat with Purpose

A loop executes a block of code repeatedly. You define the condition or count, and the loop handles the rest.

SECTION 5.1
The for loop: repeat a set number of times
🔄
Think of attendance. "For each student in the class, call their name." You repeat the same action (calling a name) for each item in a list, one by one.
Python
# Print 0, 1, 2, 3, 4
for i in range(5):
    print(i)

# Print 1 to 5
for i in range(1, 6):
    print(i)
C++
// Print 0, 1, 2, 3, 4
for (int i = 0; i < 5; i++) {
    cout << i << endl;
}

// Print 1 to 5
for (int i = 1; i <= 5; i++) {
    cout << i << endl;
}
The C++ for loop has 3 parts separated by semicolons:
int i = 0: start: create a counter variable set to 0
i < 5: condition: keep looping while this is true
i++: step: add 1 to the counter after each loop (i++ means i = i + 1)
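Python's range() can express the same three parts as range(start, stop, step). The sketch below lines them up with the C++ loop. The variable names evens and countdown are illustrative, and list() is used only to display the numbers a loop would visit.

```python
# range(start, stop, step): stop is not included
print(list(range(0, 5, 1)))    # [0, 1, 2, 3, 4]   same as range(5)
print(list(range(1, 6)))       # [1, 2, 3, 4, 5]

evens = list(range(0, 10, 2))      # like i += 2 in C++
countdown = list(range(5, 0, -1))  # like i-- in C++

print(evens)       # [0, 2, 4, 6, 8]
print(countdown)   # [5, 4, 3, 2, 1]
```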
SECTION 5.2
The while loop: repeat until a condition is false
A while loop keeps running as long as a condition stays true. Use it when you don't know in advance exactly how many times you need to loop.
Python
count = 1

while count <= 5:
    print(f"Count is {count}")
    count += 1   # count = count + 1

print("Done!")
C++
int count = 1;

while (count <= 5) {
    cout << "Count is " << count << endl;
    count++;
}

cout << "Done!" << endl;
Always make sure the loop stops. If the condition is always true (for example, you forget to add count++), the loop runs forever. This is called an infinite loop and it freezes your program.
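One defensive habit while you are learning is to give a loop a safety limit. The sketch below contains the forgotten-increment bug on purpose and uses a made-up constant, MAX_ROUNDS, to break out instead of freezing. It is a debugging aid under that assumption, not a substitute for fixing the condition.

```python
MAX_ROUNDS = 1000   # illustrative limit, pick one that suits your program

count = 1
rounds = 0

while count <= 5:
    rounds += 1
    if rounds > MAX_ROUNDS:
        print("Stopping: the loop ran too many times.")
        break
    # Bug on purpose: count is never increased, so count <= 5 stays True.

print(rounds)   # 1001
```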
SECTION 5.3
Loop through a list of items
Python
fruits = ["apple", "banana", "cherry"]

for fruit in fruits:
    print(fruit)
C++
string fruits[] = {"apple", "banana", "cherry"};

for (string fruit : fruits) {
    cout << fruit << endl;
}
Quick Check · Lesson 5
How many times does this run?  for i in range(4):
A 3 times
B 4 times (i = 0, 1, 2, 3)
C 5 times
D Infinite times
Lesson 06 · Module 1

Functions: Write Once, Use Anywhere

A function is a named, reusable block of code. Define it once, call it wherever needed.

SECTION 6.1
Defining and calling a function
📋
Think of a function like a saved recipe. You write the recipe once and give it a name. Every time you want that dish, you just say its name. You don't rewrite the entire recipe.
Python
# Define the function
def greet():
    print("Hello! Welcome.")

# Call it (can call as many times as you want)
greet()
greet()
C++
// Define the function (void = returns nothing)
void greet() {
    cout << "Hello! Welcome." << endl;
}

int main() {
    greet();  // Call it
    greet();
    return 0;
}
SECTION 6.2
Functions with parameters: passing information in
Parameters let you send information into a function so it can work with different values each time it's called.
Python
def greet(name):
    print(f"Hello, {name}!")

greet("Alice")   # Hello, Alice!
greet("Bob")     # Hello, Bob!

def add(a, b):
    print(a + b)

add(5, 3)   # 8
add(10, 20) # 30
C++
void greet(string name) {
    cout << "Hello, " << name << "!" << endl;
}

void add(int a, int b) {
    cout << a + b << endl;
}

int main() {
    greet("Alice");
    greet("Bob");
    add(5, 3);
    add(10, 20);
    return 0;
}
SECTION 6.3
Functions that return a value
Instead of just printing, a function can calculate and give back a result. You store that result in a variable and use it later. This is called a return value.
Python
def square(n):
    return n * n

result = square(5)
print(result)       # 25
print(square(8))   # 64

def full_name(first, last):
    return first + " " + last

print(full_name("Ada", "Lovelace"))
C++
int square(int n) {
    return n * n;
}

string fullName(string f, string l) {
    return f + " " + l;
}

int main() {
    cout << square(5) << endl;   // 25
    cout << square(8) << endl;   // 64
    cout << fullName("Ada", "Lovelace") << endl;
    return 0;
}
In C++: Replace void with the return type: int if the function returns a whole number, string if it returns text, double if it returns a decimal.
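The difference between printing and returning is easy to miss, so here is a small Python sketch of it. The function name show_square and the variables shown and kept are invented for the comparison.

```python
def show_square(n):
    print(n * n)       # displays the result, but gives nothing back


def square(n):
    return n * n       # hands the result back to the caller


shown = show_square(5)   # prints 25
kept = square(5)

print(shown)             # None: there was no return statement
print(kept + 1)          # 26: a returned value can be reused
```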
Quick Check · Lesson 6
What keyword sends a value back from a function?
A print
B send
C return
D output
Lesson 07 · Module 2

Arrays & Lists

A list (Python) or array (C++) stores multiple values in a single container, accessed by index.

SECTION 7.1
Creating and accessing a list / array
🗂️
Think of a numbered shelf. Slot 0 holds the first item, slot 1 holds the second, and so on. You grab any item instantly by saying its slot number. Counting always starts at 0, not 1. This trips up almost every beginner at first.
Python
names = ["Alice", "Bob", "Clara"]

print(names[0])    # Alice  (first item)
print(names[1])    # Bob    (second item)
print(names[2])    # Clara  (third item)
print(len(names))  # 3      (total count)

scores = [88, 92, 76, 95]
print(scores[0])   # 88
C++
string names[] = {"Alice", "Bob", "Clara"};

cout << names[0] << endl;  // Alice
cout << names[1] << endl;  // Bob
cout << names[2] << endl;  // Clara

int scores[] = {88, 92, 76, 95};
cout << scores[0] << endl; // 88
SECTION 7.2
Looping through a list
Python
scores = [88, 92, 76, 95]
total = 0

for s in scores:
    total += s

average = total / len(scores)
print(f"Total:   {total}")
print(f"Average: {average}")
C++
int scores[] = {88, 92, 76, 95};
int total = 0;

for (int s : scores) {
    total += s;
}

double avg = total / 4.0;
cout << "Total:   " << total << endl;
cout << "Average: " << avg   << endl;
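Once the manual loop makes sense, Python's built-in sum(), len(), max(), and min() can do the same bookkeeping in one call each. This is a sketch using the same scores as above.

```python
scores = [88, 92, 76, 95]

total = sum(scores)              # 351
average = total / len(scores)    # 87.75

print(f"Total:   {total}")
print(f"Average: {average:.2f}")   # Average: 87.75
print(f"Highest: {max(scores)}")   # Highest: 95
print(f"Lowest:  {min(scores)}")   # Lowest:  76
```

Dividing by len(scores) and not a fixed 4 keeps the average correct when the list grows or shrinks.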
Quick Check · Lesson 7
Given x = [10, 20, 30], what is x[2]?
A 10
B 20
C 30: index starts at 0
D Error
Lesson 08 · Module 2

Matrices: Grids of Data

A matrix is a two-dimensional array: rows and columns. Each element is addressed by [row][column].

SECTION 8.1
Creating a matrix
📊
Think of a seating chart. Row 0 is the front row, Row 1 is the second row. Seat 0 is the leftmost seat. Student at row 1, seat 2 is in the second row, third seat from the left, accessed as [1][2].
Python
# 3 rows, 3 columns
matrix = [
    [1, 2, 3],   # row 0
    [4, 5, 6],   # row 1
    [7, 8, 9]    # row 2
]

print(matrix[0][0])  # 1  (row 0, col 0)
print(matrix[1][2])  # 6  (row 1, col 2)
print(matrix[2][1])  # 8  (row 2, col 1)
C++
// 3 rows, 3 columns
int matrix[3][3] = {
    {1, 2, 3},   // row 0
    {4, 5, 6},   // row 1
    {7, 8, 9}    // row 2
};

cout << matrix[0][0] << endl; // 1
cout << matrix[1][2] << endl; // 6
cout << matrix[2][1] << endl; // 8
SECTION 8.2
Printing a full matrix with nested loops
To visit every cell in a matrix, use a loop inside a loop. The outer loop goes through each row. The inner loop goes through each column in that row.
Python
matrix = [[1,2,3],[4,5,6],[7,8,9]]

for row in matrix:
    for val in row:
        print(val, end=" ")
    print()  # new line after each row
C++
int m[3][3] = {{1,2,3},{4,5,6},{7,8,9}};

for (int r = 0; r < 3; r++) {
    for (int c = 0; c < 3; c++) {
        cout << m[r][c] << " ";
    }
    cout << endl;
}
Output:
1 2 3
4 5 6
7 8 9
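The Python version above loops over values directly. To mirror the C++ version's row and column indices, the sketch below uses range(len(...)) and adds up each row. The names row_totals and row_total are illustrative.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

row_totals = []
for r in range(len(matrix)):             # outer loop: each row index
    row_total = 0
    for c in range(len(matrix[r])):      # inner loop: each column index
        row_total += matrix[r][c]
    row_totals.append(row_total)

print(row_totals)   # [6, 15, 24]
```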
Quick Check · Lesson 8
In a matrix, how do you access row 2, column 0?
A matrix[0][2]
B matrix[2][0]
C matrix(2, 0)
D matrix[2, 0]
Lesson 09 · Module 2

Strings in Depth

Strings have built-in methods for searching, slicing, and transforming text. These come up constantly in practical work.

SECTION 9.1
String length and accessing characters
Every character in a string has an index, just like a list. The first character is at index 0. You can also get the length: how many characters are in the string.
Python
name = "Alice"

print(len(name))    # 5
print(name[0])     # A  (first char)
print(name[4])     # e  (last char)
print(name[-1])    # e  (last char shortcut)
C++
string name = "Alice";

cout << name.length() << endl; // 5
cout << name[0]        << endl; // A
cout << name[4]        << endl; // e
SECTION 9.2
Slicing: extracting part of a string
You can pull out a chunk of a string by specifying a start and end index. This is called slicing, and it's one of the most useful string operations.
Python
text = "Hello, World!"

print(text[0:5])   # Hello   (index 0 to 4)
print(text[7:])    # World!  (index 7 to end)
print(text[:5])    # Hello   (start to index 4)
print(text[-6:])   # World!  (last 6 chars)
C++
string text = "Hello, World!";

// substr(start, length)
cout << text.substr(0, 5) << endl; // Hello
cout << text.substr(7)    << endl; // World!
cout << text.substr(7, 5) << endl; // World
Python slicing syntax: text[start:end]: the end index is NOT included. So text[0:5] gives characters at positions 0, 1, 2, 3, 4. C++'s substr(start, length) takes a start position and a character count instead.
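The two conventions are linked by one rule: end = start + length. The sketch below wraps that rule in a hypothetical Python helper named substr, written only for this comparison. It is not a built-in Python function.

```python
def substr(text, start, length=None):
    """Mimic C++ substr(start, length) using a Python slice."""
    if length is None:
        return text[start:]              # from start to the end
    return text[start:start + length]    # end index = start + length


text = "Hello, World!"

print(text[7:12])           # World  (indices 7, 8, 9, 10, 11)
print(substr(text, 7, 5))   # World
print(substr(text, 7))      # World!
print(len(text[0:5]))       # 5: a slice's length is end - start
```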
SECTION 9.3
Essential string methods
Strings come with built-in methods: functions that belong to the string itself. You call them with a dot: text.upper(). The idea is the same in both languages, although C++ has fewer ready-made string methods, so some tasks take more code.
Python
text = "  Hello, World!  "

print(text.upper())         # "  HELLO, WORLD!  "
print(text.lower())         # "  hello, world!  "
print(text.strip())         # "Hello, World!"  (leading/trailing spaces removed)
print(text.replace("World", "Python"))
# "  Hello, Python!  "

words = "one,two,three"
print(words.split(","))     # ['one', 'two', 'three']
print("ell" in text)         # True (contains check)
C++
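#include <algorithm>
#include <cctype>
string text = "Hello, World!";

// to uppercase (the lambda avoids ambiguity between toupper overloads)
string up = text;
transform(up.begin(), up.end(), up.begin(),
          [](unsigned char c) { return toupper(c); });
cout << up << endl;    // HELLO, WORLD!

// find a substring (find returns size_t, or string::npos if absent)
size_t pos = text.find("World");
cout << pos << endl;   // 7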

// check if contains
if (text.find("World") != string::npos)
    cout << "Found!" << endl;
Python is much friendlier for string work; C++ strings are more verbose, and many real C++ projects use helper libraries. Still, these basics cover most everyday string tasks.
SECTION 9.4
String concatenation and conversion
Joining strings together is called concatenation. You also often need to convert numbers to strings and back.
Python
# Joining strings
first = "Ada"
last  = "Lovelace"
full  = first + " " + last
print(full)          # Ada Lovelace

# Number to string
age = 25
msg = "Age: " + str(age)
print(msg)           # Age: 25

# String to number
num = int("42")
print(num + 8)       # 50
C++
// Joining strings
string first = "Ada";
string last  = "Lovelace";
string full  = first + " " + last;
cout << full << endl;   // Ada Lovelace

// Number to string
int age = 25;
string msg = "Age: " + to_string(age);
cout << msg << endl;    // Age: 25

// String to number
int num = stoi("42");
cout << num + 8 << endl; // 50
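Two related conversions are sketched below: parsing a decimal number with float(), and building a message with an f-string instead of str() and +. The variable names and values are illustrative.

```python
# String to floating-point number
price = float("19.99")
total = price * 2

# f-string: values inside {} are converted to text automatically
msg = f"Total: {total:.2f}"    # :.2f keeps two decimal places
print(msg)                     # Total: 39.98

# Equivalent using str() and +
msg_concat = "Total: " + str(round(total, 2))
print(msg_concat)              # Total: 39.98

# Mixing a string and a number with + is an error in Python
try:
    bad = "Age: " + 25
except TypeError as e:
    print("TypeError:", e)
```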
Quick Check · Lesson 9
In Python, what does "hello".upper() return?
A "Hello"
B "HELLO"
C "hello"
D 5
Lesson 10 · Module 2

Dictionaries & Maps

A dictionary (Python) or map (C++) stores key-value pairs. Access by key rather than by numeric index.

SECTION 10.1
Creating and accessing a dictionary
📖
Think of a physical dictionary. You look up a word (the key) and get its definition (the value). You don't need to know which page number it's on: you just use the word to find what you need.
Python
# Create a dictionary
person = {
    "name": "Alice",
    "age": 22,
    "city": "Doha"
}

# Access by key
print(person["name"])   # Alice
print(person["age"])    # 22

# Add or update
person["email"] = "[email protected]"
person["age"]   = 23

print(person)
C++
#include <map>

map<string, string> person;

person["name"] = "Alice";
person["city"] = "Doha";

cout << person["name"] << endl; // Alice
cout << person["city"] << endl; // Doha

// Map with int values
map<string, int> scores;
scores["Alice"] = 95;
scores["Bob"]   = 87;
SECTION 10.2
Looping through a dictionary
Python
grades = {
    "Alice": 95,
    "Bob":   87,
    "Clara": 91
}

# Loop through keys and values
for name, score in grades.items():
    print(f"{name}: {score}")

# Check if key exists
if "Alice" in grades:
    print("Alice is in the list")
C++
map<string, int> grades;
grades["Alice"] = 95;
grades["Bob"]   = 87;
grades["Clara"] = 91;

// Loop through all pairs
for (const auto& pair : grades) {
    cout << pair.first << ": "
         << pair.second << endl;
}

// Check if key exists
if (grades.count("Alice"))
    cout << "Alice found" << endl;
SECTION 10.3
Useful dictionary methods (Python)
Python dictionaries have several methods that make them easy to work with.
Python
person = {"name": "Alice", "age": 22}

print(person.keys())     # dict_keys(['name', 'age'])
print(person.values())   # dict_values(['Alice', 22])
print(len(person))       # 2

# Safe access: returns the default if the key is missing
# (without a second argument, get() returns None)
print(person.get("email", "not found"))   # not found

# Delete a key
del person["age"]
print(person)
C++
map<string, string> person;
person["name"] = "Alice";
person["age"]  = "22";

cout << person.size() << endl; // 2

// Delete a key
person.erase("age");
cout << person.size() << endl; // 1
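A short sketch of why get() counts as safe access: indexing a missing key with [] raises KeyError, while get() returns a fallback. The counting loop at the end is a common use of that fallback; the sample data is made up for illustration.

```python
person = {"name": "Alice", "age": 22}

# [] on a missing key raises KeyError
try:
    email = person["email"]
except KeyError:
    email = None
print(email)                                  # None

# get() returns None, or the default you pass in
print(person.get("email"))                    # None
print(person.get("email", "not found"))       # not found

# Counting with get(): start from 0 the first time a key is seen
letters = ["a", "b", "a", "c", "a"]
counts = {}
for ch in letters:
    counts[ch] = counts.get(ch, 0) + 1
print(counts)                                 # {'a': 3, 'b': 1, 'c': 1}
```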
Quick Check · Lesson 10
Given d = {"x": 10, "y": 20}, how do you access the value 10?
A d[0]
B d["x"]
C d.get(0)
D d.x
Lesson 11 · Module 3

Logical Operators

Logical operators combine or invert boolean conditions. AND, OR, and NOT are the building blocks of all conditional logic.

SECTION 11.1
and, or, not
These three operators are the foundation of all logical thinking in programming. Python uses the words and, or, not. C++ uses symbols &&, ||, !.
Operator   Python   C++   Meaning
AND        and      &&    Both conditions must be true
OR         or       ||    At least one must be true
NOT        not      !     Flips true to false, false to true
Python
age   = 20
score = 85

# AND — both must be true
if age >= 18 and score >= 80:
    print("Eligible and high score")

# OR — at least one true
if age < 13 or age > 65:
    print("Special discount")

# NOT — flip the result
logged_in = False
if not logged_in:
    print("Please log in first")
C++
int  age   = 20;
int  score = 85;
bool loggedIn = false;

// AND
if (age >= 18 && score >= 80)
    cout << "Eligible and high score" << endl;

// OR
if (age < 13 || age > 65)
    cout << "Special discount" << endl;

// NOT
if (!loggedIn)
    cout << "Please log in first" << endl;
SECTION 11.2
Truth table: how and/or work
A       B       A and B   A or B   not A
True    True    True      True     False
True    False   False     True     False
False   True    False     True     True
False   False   False     False    True
Short-circuit evaluation: With and, if the first condition is False, Python and C++ skip the second, because the result can't be true anyway. With or, if the first is True, they skip the second. This matters for performance and for avoiding errors, since the second condition can safely depend on the first.
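The error-avoidance point can be sketched as follows: the second condition is only evaluated when the first one allows it. The function and variable names are illustrative, not part of any library.

```python
def safe_ratio_above(total, count, threshold):
    # count != 0 is checked first; if it is False, the division
    # on the right is never evaluated, so there is no ZeroDivisionError
    return count != 0 and total / count > threshold

print(safe_ratio_above(100, 4, 20))   # True  (100 / 4 = 25 > 20)
print(safe_ratio_above(100, 0, 20))   # False (division skipped)

# The same idea with or: the right side runs only if the left is False
calls = []

def check(label, value):
    calls.append(label)    # record that this condition was evaluated
    return value

result = check("first", True) or check("second", True)
print(calls)               # ['first']
```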
Quick Check · Lesson 11
What is the result of True and False?
A True
B False
C None
D Error
Lesson 12 · Module 3

Loop Control: break & continue

Sometimes you need to stop a loop early, or skip certain iterations. break exits a loop immediately. continue skips the current iteration. Both behave identically in Python and C++.

SECTION 12.1
break: stop the loop immediately
🛑
Think of it like a fire alarm. You're going through your task list, but when the alarm sounds, you stop everything immediately and leave. You don't finish the list.
Python
# Stop as soon as we find 5
for i in range(1, 10):
    if i == 5:
        print("Found 5, stopping!")
        break
    print(i)

# Output: 1 2 3 4 Found 5, stopping!
C++
// Stop as soon as we find 5
for (int i = 1; i < 10; i++) {
    if (i == 5) {
        cout << "Found 5, stopping!" << endl;
        break;
    }
    cout << i << endl;
}
// Output: 1 2 3 4 Found 5, stopping!
SECTION 12.2
continue: skip this iteration, keep going
Think of skipping a song. You don't stop listening to music entirely: you just skip this one track and move on to the next.
Python
# Print only odd numbers (skip evens)
for i in range(1, 11):
    if i % 2 == 0:   # if even
        continue
    print(i)

# Output: 1 3 5 7 9
C++
// Print only odd numbers
for (int i = 1; i <= 10; i++) {
    if (i % 2 == 0)
        continue;
    cout << i << endl;
}
// Output: 1 3 5 7 9
SECTION 12.3
Real-world example: search and stop
Python
names = ["Alice", "Bob", "Clara", "Dave"]
target = "Clara"

for name in names:
    if name == target:
        print(f"Found {name}!")
        break
    print(f"Checked {name}...")
C++
string names[] = {"Alice","Bob","Clara","Dave"};
string target = "Clara";

for (string name : names) {
    if (name == target) {
        cout << "Found " << name << "!" << endl;
        break;
    }
    cout << "Checked " << name << endl;
}
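break and continue are often used together in one loop. A minimal sketch, with made-up sensor readings: None marks a missing value to skip, and a negative number is an assumed end-of-data marker.

```python
readings = [12.5, None, 14.1, 13.8, -1, 15.0]

valid = []
for r in readings:
    if r is None:
        continue          # skip the missing value, move to the next reading
    if r < 0:
        break             # end-of-data marker: leave the loop entirely
    valid.append(r)

print(valid)              # [12.5, 14.1, 13.8]
```

Note that 15.0 never reaches the list: once break runs, the remaining items are not visited.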
Quick Check · Lesson 12
What does continue do inside a loop?
A Exits the entire program
B Stops the loop completely
C Skips the current iteration and moves to the next
D Restarts the loop from the beginning
Lesson 13 · Module 3

Error Handling

Errors are inevitable. Error handling lets a program respond to them at runtime rather than crash.

SECTION 13.1
What is an error?
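When something goes wrong at runtime (dividing by zero, converting "abc" to an integer, accessing a list index that doesn't exist), Python raises an exception. C++ throws an exception in some of these cases, such as stoi on invalid text, but integer division by zero and out-of-range access with [] are undefined behaviour. Without error handling, an exception stops your program immediately with an error message.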
Python: WITHOUT handling (crashes)
num = int("hello")   # CRASH!
# ValueError: invalid literal for int()

result = 10 / 0        # CRASH!
# ZeroDivisionError: division by zero
C++: WITHOUT handling (undefined)
int result = 10 / 0;     // Undefined behaviour
int x = stoi("hello"); // throws exception
SECTION 13.2
try / except: catching errors in Python
🪣
Think of a safety net. You try something risky. If it fails, the net catches you: and you decide what to do next instead of hitting the ground.
Python
# Basic try / except
try:
    num = int("hello")
    print(num)
except ValueError:
    print("That's not a valid number!")

# Catching division by zero
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero.")

# Catch any error
try:
    x = int(input("Enter a number: "))
    print(100 / x)
except Exception as e:
    print(f"Error: {e}")
C++
#include <stdexcept>

// Basic try / catch
try {
    int num = stoi("hello"); // throws
    cout << num << endl;
} catch (invalid_argument& e) {
    cout << "Not a valid number!" << endl;
}

// Catch any exception
try {
    int x = 0;
    if (x == 0) throw runtime_error("div by zero");
    cout << 10/x << endl;
} catch (exception& e) {
    cout << "Error: " << e.what() << endl;
}
SECTION 13.3
try / except / finally: always run cleanup
The finally block runs no matter what: whether an error happened or not. Use it for cleanup tasks like closing files or showing a "done" message.
Python
try:
    num = int("42")
    print(f"Got: {num}")
except ValueError:
    print("Invalid number")
finally:
    print("This always runs")

# Output:
# Got: 42
# This always runs
C++
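// C++ has no "finally" keyword.
// Here, code placed after a catch-all try/catch runs in both cases.
// For real cleanup, C++ relies on destructors (RAII) instead.
try {
    int num = stoi("42");
    cout << "Got: " << num << endl;
} catch (...) {
    cout << "Invalid number" << endl;
}
// Runs whether or not an exception was caught above
cout << "This always runs" << endl;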
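A sketch of the claim that finally runs no matter what, including when no except block matches and the error keeps propagating. The function name and the events list are illustrative.

```python
events = []

def parse(text):
    try:
        value = int(text)
        events.append(f"parsed {value}")
        return value
    except ValueError:
        events.append("invalid")
        return None
    finally:
        # Runs on success, on a handled error, and even after return
        events.append("cleanup")

parse("42")
parse("hello")
print(events)   # ['parsed 42', 'cleanup', 'invalid', 'cleanup']

# finally also runs when the error is NOT handled in the same try
try:
    try:
        10 / 0                      # ZeroDivisionError, no matching except here
    finally:
        events.append("cleanup after unhandled error")
except ZeroDivisionError:
    events.append("caught by outer handler")

print(events[-2:])
```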
Quick Check · Lesson 13
In Python, which block of code runs even if no error occurred?
A except
B try
C finally
D catch
Lesson 14 · Module 3

Variable Scope

Scope defines where a variable is accessible. A variable declared inside a function does not exist outside it.

SECTION 14.1
Local vs Global variables
🏠
Think of a house with rooms. A global variable is in the living room: everyone can see it. A local variable is in a bedroom: only the person in that room can use it. When you leave the room (function ends), the local variable is gone.
Python
name = "Alice"   # GLOBAL — visible everywhere

def greet():
    message = "Hello!"   # LOCAL — only inside greet()
    print(name)           # Can access global
    print(message)        # Can access local

greet()
print(name)     # Works — global
print(message)  # ERROR — message doesn't exist here
C++
string name = "Alice";  // GLOBAL

void greet() {
    string message = "Hello!";  // LOCAL
    cout << name << endl;     // Can access global
    cout << message << endl;  // Can access local
}

int main() {
    greet();
    cout << name << endl;  // Works
    // cout << message — ERROR
    return 0;
}
SECTION 14.2
Modifying a global variable inside a function
By default, a function in Python can read a global variable but not reassign it: an assignment inside the function creates a new local variable instead. To change the global, you must declare it with the global keyword. In C++, global variables can always be read and modified.
Python
count = 0

def increment():
    global count       # must declare this
    count += 1

increment()
increment()
increment()
print(count)   # 3
C++
int count = 0;  // global

void increment() {
    count++;     // no keyword needed in C++
}

int main() {
    increment();
    increment();
    increment();
    cout << count << endl;  // 3
    return 0;
}
Best practice: Avoid global variables when you can. Pass values as parameters instead. Globals make code hard to understand: you can't tell what changes them without reading every function. Use them only when truly necessary.
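Both points can be sketched in Python: what happens without the global keyword, and the parameter-passing alternative that the best practice recommends. The function names are illustrative.

```python
count = 0

def increment_broken():
    # Without "global count", the assignment makes count a local name,
    # so reading it before it has a value fails
    count += 1

try:
    increment_broken()
except UnboundLocalError as e:
    print("Error:", e)

# Preferred: take the value as a parameter and return the new one
def increment(value):
    return value + 1

count = increment(count)
count = increment(count)
count = increment(count)
print(count)   # 3
```

With the second version, every change to count is visible at the call site, which is the readability gain the best-practice note describes.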
Quick Check · Lesson 14
In Python, what keyword lets you modify a global variable from inside a function?
A extern
B global
C public
D shared
Lesson 15 · Module 4: Object-Oriented Programming

Classes & Objects

A class is a blueprint. An object is an instance of that blueprint. This is the foundation of object-oriented programming.

SECTION 15.1
What is a class?
🏗️
Think of a class like an architectural blueprint for a house. The blueprint defines what every house built from it will have: rooms, windows, a door. Each actual house built from that blueprint is an object. You can build many houses from one blueprint: all slightly different, but all following the same structure.
A class groups together data (called attributes) and functions that work on that data (called methods) into one unit. Instead of having separate variables for name, age, and email, you bundle them into a Person class.
SECTION 15.2
Defining and using a class
Python
class Person:
    def __init__(self, name, age):
        self.name = name    # attribute
        self.age  = age

    def greet(self):       # method
        print(f"Hi, I'm {self.name}, {self.age}.")

    def birthday(self):
        self.age += 1
        print(f"Happy birthday {self.name}! Now {self.age}.")

# Create objects from the class
alice = Person("Alice", 22)
bob   = Person("Bob",   30)

alice.greet()
bob.greet()
alice.birthday()
print(alice.age)      # 23
C++
class Person {
public:
    string name;
    int    age;

    // Constructor
    Person(string n, int a) {
        name = n;
        age  = a;
    }

    void greet() {
        cout << "Hi, I'm " << name
             << ", " << age << "." << endl;
    }

    void birthday() {
        age++;
        cout << "Happy birthday " << name
             << "! Now " << age << "." << endl;
    }
};

int main() {
    Person alice("Alice", 22);
    Person bob("Bob", 30);
    alice.greet();
    bob.greet();
    alice.birthday();
    return 0;
}
SECTION 15.3
Key vocabulary
Term | Meaning | Example
Class | The blueprint / template | class Person:
Object / Instance | A specific thing built from the class | alice = Person("Alice", 22)
Attribute | Data stored in the object | self.name, self.age
Method | A function that belongs to the class | def greet(self):
Constructor | Special method that runs when object is created | __init__ (Python) / ClassName() (C++)
self / this | Refers to the current object | self.name (Python), this->name (C++)
SECTION 15.4
A real example: Bank Account
Python
class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner   = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        print(f"Deposited {amount}. Balance: {self.balance}")

    def withdraw(self, amount):
        if amount > self.balance:
            print("Insufficient funds.")
        else:
            self.balance -= amount
            print(f"Withdrew {amount}. Balance: {self.balance}")

acc = BankAccount("Alice", 100)
acc.deposit(50)
acc.withdraw(30)
acc.withdraw(200)
C++
class BankAccount {
public:
    string owner;
    double balance;

    BankAccount(string o, double b) {
        owner = o; balance = b;
    }

    void deposit(double amt) {
        balance += amt;
        cout << "Balance: " << balance << endl;
    }

    void withdraw(double amt) {
        if (amt > balance)
            cout << "Insufficient funds." << endl;
        else { balance -= amt;
            cout << "Balance: " << balance << endl; }
    }
};
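One consequence of the blueprint idea is that every object carries its own copy of the attributes. A compact sketch, using a trimmed-down variant of the BankAccount class above (printing removed and withdraw returning True/False, both changes made here only so the state is easier to inspect):

```python
class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            return False          # refused: not enough money
        self.balance -= amount
        return True

alice = BankAccount("Alice", 100)
bob = BankAccount("Bob")          # balance defaults to 0

alice.deposit(50)
ok = bob.withdraw(10)             # Bob's account is empty

print(alice.balance)              # 150
print(bob.balance)                # 0
print(ok)                         # False
```

Depositing into alice leaves bob untouched: the two objects share the class's methods but not its data.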
Quick Check · Lesson 15
In Python, what is the name of the special method that runs automatically when you create a new object?
A start()
B create()
C __init__
D constructor()
Lesson 16 · Module 4

Math & Libraries

Libraries are collections of pre-written code you import and use. The standard math library covers roots, rounding, and trigonometry.

SECTION 16.1
The math library
Both Python and C++ have a built-in math library. In Python you import it with import math. In C++ you include it with #include <cmath>.
Python
import math

print(math.sqrt(16))      # 4.0
print(math.sqrt(2))       # 1.4142...
print(math.floor(3.7))   # 3  (round down)
print(math.ceil(3.2))    # 4  (round up)
print(round(3.567, 2))   # 3.57
print(abs(-15))           # 15
print(math.pow(2, 8))    # 256.0
print(math.pi)            # 3.14159...
C++
#include <cmath>

cout << sqrt(16)       << endl; // 4
cout << sqrt(2)        << endl; // 1.4142
cout << floor(3.7)    << endl; // 3
cout << ceil(3.2)     << endl; // 4
cout << round(3.567) << endl; // 4
cout << abs(-15)      << endl; // 15
cout << pow(2, 8)     << endl; // 256
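// M_PI is a POSIX extension, not standard C++
// (MSVC needs #define _USE_MATH_DEFINES before <cmath>)
cout << M_PI           << endl; // 3.14159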
SECTION 16.2
Random numbers
Generating random numbers is essential for games, simulations, and testing. Both languages have a random library.
Python
import random

# Random integer between 1 and 10
n = random.randint(1, 10)
print(n)

# Random float between 0 and 1
f = random.random()
print(f)

# Pick a random item from a list
colours = ["red", "blue", "green"]
print(random.choice(colours))

# Shuffle a list
random.shuffle(colours)
print(colours)
C++
#include <cstdlib>
#include <ctime>

// Seed the random generator
srand(time(0));

// Random int 1–10
int n = rand() % 10 + 1;
cout << n << endl;

// Random float 0–1
double f = (double)rand() / RAND_MAX;
cout << f << endl;
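The C++ example seeds its generator explicitly; Python seeds itself automatically, but you can fix the seed when you need repeatable results, for example in tests. A minimal sketch (the seed values 42 and 7 are arbitrary):

```python
import random

random.seed(42)
first_run = [random.randint(1, 10) for _ in range(5)]

random.seed(42)                    # same seed again
second_run = [random.randint(1, 10) for _ in range(5)]

print(first_run == second_run)     # True: same seed, same sequence

# A separate generator object keeps its own state
rng = random.Random(7)
roll = rng.randint(1, 6)
print(1 <= roll <= 6)              # True: randint includes both ends
```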
SECTION 16.3
What is a library / module?
A library is someone else's code that you can use in your program. Instead of writing a square root function yourself, you import Python's math module and use math.sqrt(). There are libraries for web requests, databases, plotting, machine learning, image processing. Using them is a core part of practical programming.
Python Library | What it does
math | Square roots, trigonometry, logarithms
random | Random numbers, shuffling, picking
datetime | Dates, times, time differences
os | File system operations
json | Read and write JSON data
requests | Make HTTP requests to the internet
numpy | Fast maths on large arrays (scientific)
pandas | Work with tables of data
Quick Check · Lesson 16
In Python, how do you calculate the square root of 25?
A sqrt(25)
B math.sqrt(25): after import math
C 25 ** 0.5 only
D Math.sqrt(25)
Engineer Lesson 01 · Module 5

Python for Mechanical & Aerospace Engineers

Python covers the same ground as MATLAB and Excel, with better automation and no licence cost. This module covers the specific tools aerospace and mechanical engineers use in practice.

SECTION E1.1
Where engineers waste the most time, and how Python fixes it
Typical engineering task | Excel / MATLAB | Python
Process 10,000 rows of sensor data | Excel slows down, crashes | Loads in under a second with pandas
Plot a stress-strain curve with annotations | Excel chart: limited control | Full control with matplotlib in 15 lines
Fit a regression line to test data | Excel trendline: no equation output | numpy.polyfit gives slope and intercept; R² takes a few more lines
Solve a system of equations (stiffness matrix) | MATLAB: costs thousands per licence | numpy.linalg.solve: free, one-line equivalent of A\b
Run the same analysis on 50 test files | Open each one manually | Loop through all files in 10 lines
Share your analysis method | Send the file and hope formatting survives | Send the script: identical result every time
SECTION E1.2
The four libraries you need
Library | What it does | Engineering use case
numpy | Fast maths on arrays: vectors, matrices, trig, linear algebra | Stress calculations, coordinate transforms, solving Ax = b, FEA pre/post
pandas | Load, inspect, filter, and transform tables of data | Sensor logs, fatigue test results, wind tunnel data, material databases
matplotlib | Plot any graph: line, scatter, bar, contour, polar | Stress-strain curves, lift polars, trajectory plots, temperature maps
scipy | Scientific algorithms: signal processing, curve fitting, integration, ODE solving | Vibration analysis, aerodynamic data fitting, flight path integration
Install everything at once: open a terminal and run pip install numpy pandas matplotlib scipy openpyxl (openpyxl is the package pandas uses to read .xlsx files). That's your core engineering toolkit, covering much of what a MATLAB installation offers, without the licence.
SECTION E1.3
What a real engineering script looks like
Here's a complete example: loading tensile test data, computing yield strength, plotting the curve with annotations. You'll build every part of this in the next five lessons.
tensile_analysis.py: a complete engineering workflow
import numpy  as np
import pandas as pd
import matplotlib.pyplot as plt

# 1. Load test data from the testing machine CSV export
df = pd.read_csv("al6061_tensile.csv")           # columns: strain_pct, stress_MPa

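# 2. Find the 0.2% proof stress (yield strength)
E = 68900                                            # Young's modulus, MPa (68.9 GPa)
offset_line = E * (df["strain_pct"] - 0.2) / 100     # elastic line shifted by 0.2% strain
crossed = df[df["stress_MPa"] <= offset_line]        # points on or below the offset line
proof_stress = crossed["stress_MPa"].iloc[0]         # first crossing (assumes rows are in strain order and the curve reaches the line)

# 3. Compute basic stats
uts = df["stress_MPa"].max()
print(f"0.2% proof stress: {proof_stress:.1f} MPa")
print(f"UTS:               {uts:.1f} MPa")
print(f"Fracture strain:   {df['strain_pct'].iloc[-1]:.2f}%")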

# 4. Plot the stress-strain curve
plt.figure(figsize=(9, 6))
plt.plot(df["strain_pct"], df["stress_MPa"],
         color="steelblue", linewidth=2, label="Al 6061-T6")
plt.axhline(uts, color="red", linestyle="--", label=f"UTS = {uts:.0f} MPa")
plt.xlabel("Strain (%)");  plt.ylabel("Stress (MPa)")
plt.title("Tensile Test — Al 6061-T6")
plt.legend();  plt.grid(True, alpha=0.3)
plt.savefig("stress_strain.png", dpi=300)
plt.show()
Engineer Lesson 02 · Module 5

NumPy: Engineering Maths at Scale

NumPy is the numerical foundation of scientific Python. It operates on entire arrays at once: the same vectorised approach used in MATLAB, without the licence.

SECTION E2.1
Arrays: the foundation
A NumPy array is like a Python list, but optimised for numbers. The key difference: you can apply maths to the whole array at once. No loop needed.
arrays.py
import numpy as np

# Create arrays from data
stress  = np.array([0, 50, 100, 150, 200, 250])   # MPa
strain  = np.array([0, 0.07, 0.14, 0.22, 0.29, 0.36]) # %

# Maths on every element at once — no loop
stress_psi = stress * 145.038    # convert MPa → psi (all 6 values at once)
strain_dec = strain / 100         # percent → decimal

# Generate engineering sequences
angles    = np.linspace(0, 360, 361)   # 0° to 360°, every 1°
time      = np.linspace(0, 10, 1000)   # 0 to 10 seconds, 1000 points
thicknesses = np.arange(1, 20, 0.5)    # 1mm to 19.5mm, step 0.5mm

# Array properties
print(stress.shape)    # (6,) — 1D array with 6 elements
print(stress.size)     # 6
print(stress.dtype)    # int64 (int32 on some platforms)
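To see what "no loop needed" replaces, here is the same unit conversion written both ways. This is a sketch that assumes NumPy is installed; the variable names follow the example above.

```python
import numpy as np

stress = np.array([0, 50, 100, 150, 200, 250])   # MPa

# Plain-Python version: one element at a time
psi_loop = []
for s in stress:
    psi_loop.append(s * 145.038)

# NumPy version: one expression for the whole array
psi_vec = stress * 145.038

print(np.allclose(psi_loop, psi_vec))   # True: same numbers either way

# Comparisons are element-wise too, and the result can filter the array
high = stress[stress >= 150]
print(high)                              # [150 200 250]
```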
SECTION E2.2
Trigonometry and vectors: used constantly in aerospace
trig_vectors.py
import numpy as np

# Trig — angles in radians (convert first)
aoa_deg = np.array([0, 2, 4, 6, 8, 10, 12])    # angle of attack, degrees
aoa_rad = np.radians(aoa_deg)

# Flat-plate lift: CL = 2π·sin(α); thin aerofoil theory uses the small-angle form CL ≈ 2π·α
CL = 2 * np.pi * np.sin(aoa_rad)
print("CL values:", np.round(CL, 3))

# Decompose a velocity vector into components
V_total = 250       # m/s (TAS)
gamma   = 15        # flight path angle, degrees
Vx = V_total * np.cos(np.radians(gamma))   # horizontal component
Vz = V_total * np.sin(np.radians(gamma))   # vertical component (climb)
print(f"Vx = {Vx:.1f} m/s,  Vz = {Vz:.1f} m/s")

# 2D rotation matrix — rotate a force vector by θ
theta = np.radians(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
F = np.array([1000, 0])       # 1000 N in x direction
F_rotated = R @ F               # @ is matrix multiply
print(f"Rotated force: {F_rotated.round(1)} N")
SECTION E2.3
Solving systems of equations: replacing MATLAB's backslash
Stiffness matrices, force equilibrium, circuit analysis: all reduce to Ax = b. NumPy solves this in one line with np.linalg.solve, the equivalent of MATLAB's x = A\b for a square system.
linear_systems.py
import numpy as np

# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Example: 3-bar truss, solving for nodal displacements
# K·u = F  →  stiffness matrix × displacements = forces
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

K = np.array([[ 3, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  1]], dtype=float) # stiffness matrix

F = np.array([0, 10000, 0])               # applied forces, N

# Solve — equivalent to MATLAB's u = K\F
u = np.linalg.solve(K, F)
print("Displacements (mm):", u.round(4))

# Matrix properties engineers need
print(f"Determinant:   {np.linalg.det(K):.2f}")
print(f"Condition no.: {np.linalg.cond(K):.2f}")  # ill-conditioned if large

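# K is symmetric, so eigh applies: real eigenvalues, returned in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(K)
# sqrt(eigenvalue) is proportional to natural frequency only for a unit mass matrix
print("Natural frequencies (proportional):", np.sqrt(eigenvalues).round(3))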
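A quick way to gain confidence in a solved system is to substitute the result back in. The sketch below reuses the K and F from the example above and checks that K·u reproduces F; it assumes NumPy is installed, and the 2×2 singular matrix is made up for illustration.

```python
import numpy as np

K = np.array([[ 3, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  1]], dtype=float)
F = np.array([0, 10000, 0], dtype=float)

u = np.linalg.solve(K, F)

# Substitute back: the residual K·u - F should be (numerically) zero
residual = K @ u - F
print(np.allclose(K @ u, F))          # True
print(np.abs(residual).max() < 1e-6)

# solve() raises LinAlgError for a singular matrix instead of returning nonsense
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])     # second row = 2 x first row
try:
    np.linalg.solve(singular, np.array([1.0, 2.0]))
except np.linalg.LinAlgError:
    print("Matrix is singular: no unique solution")
```

If the residual check fails or the condition number is very large, the matrix itself deserves a second look before the displacements are trusted.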
SECTION E2.4
ISA atmosphere model: a practical example
The International Standard Atmosphere defines how air density, pressure, and temperature vary with altitude. Here's how to compute it for an entire altitude range at once using NumPy.
isa_atmosphere.py
import numpy as np

# ISA constants
T0   = 288.15   # sea level temperature, K
P0   = 101325   # sea level pressure, Pa
rho0 = 1.225    # sea level density, kg/m³
L    = 0.0065   # lapse rate, K/m
g    = 9.80665  # gravitational acceleration
R    = 287.05   # specific gas constant, J/(kg·K)

# Altitude range: sea level to 11 km (troposphere), 100 points
h = np.linspace(0, 11000, 100)     # metres

# ISA equations — applied to entire array at once
T   = T0 - L * h                         # temperature, K
P   = P0 * (T / T0) ** (g / (L * R))    # pressure, Pa
rho = P / (R * T)                         # density, kg/m³
a   = np.sqrt(1.4 * R * T)               # speed of sound, m/s

# Print at specific altitudes
for alt in [0, 3000, 6000, 9000, 11000]:
    idx = np.argmin(np.abs(h - alt))     # find nearest index
    print(f"h={alt:5d}m  T={T[idx]:.1f}K  rho={rho[idx]:.4f}kg/m³  a={a[idx]:.1f}m/s")
Quick Check · Engineer Lesson 2
You have stress values in MPa: s = np.array([100, 200, 300]). How do you convert them all to psi (1 MPa = 145.038 psi) in one line?
A for x in s: x * 145.038
B s * 145.038
C s.convert(145.038)
D np.multiply_each(s, 145.038)
Engineer Lesson 03 · Module 5

Statistics for Engineers

Test data always has scatter. This lesson covers the statistical tools for characterising distributions, fitting regression lines, and quantifying uncertainty.

SECTION E3.1
Descriptive statistics: characterising your data
descriptive_stats.py: tensile test results from 20 specimens
import numpy as np

# UTS results from 20 tensile specimens of Ti-6Al-4V (MPa)
uts = np.array([940, 952, 938, 961, 945, 958, 933, 970,
                948, 955, 942, 964, 950, 937, 956, 947,
                962, 943, 959, 951])

mean   = np.mean(uts)
std    = np.std(uts, ddof=1)        # ddof=1 for sample std dev
cv     = (std / mean) * 100         # coefficient of variation, %
se     = std / np.sqrt(len(uts))    # standard error

print(f"Mean UTS:           {mean:.1f} MPa")
print(f"Std deviation:      {std:.1f} MPa")
print(f"Coefficient of var: {cv:.2f}%")
print(f"Min / Max:          {uts.min()} / {uts.max()} MPa")
print(f"95% CI (approx):    {mean:.1f} ± {2*se:.1f} MPa")

# Spot outliers: values beyond 2 standard deviations
outliers = uts[np.abs(uts - mean) > 2 * std]
print(f"Outliers:           {outliers}")
ddof=1 vs ddof=0: Use ddof=1 for sample standard deviation (when your data is a sample from a larger population, which it almost always is in testing). Use ddof=0 only when you have the entire population. Getting this wrong is a common mistake in engineering reports.
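What ddof changes is only the divisor: n − 1 for a sample, n for a population. The sketch below spells that out with the standard library so the arithmetic is visible; statistics.stdev and statistics.pstdev play the roles of np.std(data, ddof=1) and np.std(data, ddof=0). The five values are the first five UTS results from the example above.

```python
import math
import statistics

data = [940, 952, 938, 961, 945]          # MPa
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations

std_sample = math.sqrt(ss / (n - 1))      # ddof=1: divide by n - 1
std_population = math.sqrt(ss / n)        # ddof=0: divide by n

print(f"sample:     {std_sample:.2f} MPa")
print(f"population: {std_population:.2f} MPa")

# Cross-check against the standard library
print(math.isclose(std_sample, statistics.stdev(data)))        # True
print(math.isclose(std_population, statistics.pstdev(data)))   # True

# The sample value is always the larger one, and the gap shrinks as n grows
print(std_sample > std_population)        # True
```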
SECTION E3.2
Linear regression: fitting a line to test data
You have test points scattered on a graph and you want the best-fit line: the trend. This is linear regression. NumPy does it with polyfit, which returns the slope and intercept; R², which quantifies how good the fit is, takes a few extra lines.
linear_regression.py: fatigue S-N curve data
import numpy as np
import matplotlib.pyplot as plt

# Fatigue data: stress amplitude (MPa) vs log10(cycles to failure)
stress   = np.array([400, 350, 300, 280, 260, 240, 220, 200])
log_N    = np.array([3.8, 4.2, 4.9, 5.3, 5.7, 6.1, 6.6, 7.0])

# Linear regression: fit a straight line y = m*x + c
coeffs  = np.polyfit(log_N, stress, 1)   # degree 1 = linear
m, c    = coeffs
print(f"Slope (m):     {m:.2f}")
print(f"Intercept (c): {c:.2f}")
print(f"Equation:      stress = {m:.1f} * log10(N) + {c:.1f}")

# Calculate R² (coefficient of determination)
predicted  = np.polyval(coeffs, log_N)
ss_res     = np.sum((stress - predicted) ** 2)
ss_tot     = np.sum((stress - stress.mean()) ** 2)
r_squared  = 1 - (ss_res / ss_tot)
print(f"R² = {r_squared:.4f}")           # closer to 1.0 = better fit

# Predict: what stress amplitude corresponds to 10^6 cycles?
stress_at_1M = np.polyval(coeffs, 6.0)
print(f"Fatigue strength at 10^6 cycles: {stress_at_1M:.1f} MPa")

# Plot
fit_line = np.linspace(3.5, 7.5, 100)
plt.scatter(log_N, stress, color="steelblue", s=60, label="Test data", zorder=5)
plt.plot(fit_line, np.polyval(coeffs, fit_line), "r--",
         label=f"Linear fit  R²={r_squared:.3f}")
plt.xlabel("log₁₀(N)");  plt.ylabel("Stress amplitude (MPa)")
plt.title("S-N Curve — Al 2024-T3")
plt.legend();  plt.grid(True, alpha=0.3)
plt.show()
SECTION E3.3
Polynomial curve fitting: nonlinear data
Not all engineering relationships are linear. Drag polars, material nonlinearity, and calibration curves often call for higher-degree polynomials. polyfit handles these too.
poly_fit.py: lift polar fit
import numpy as np
import matplotlib.pyplot as plt

# Experimental lift coefficient data (clean wing, subsonic)
alpha = np.array([-4, -2,  0,  2,  4,  6,  8, 10, 12])  # AoA, deg
CL    = np.array([-0.28, -0.06, 0.18, 0.42, 0.65, 0.88,
                    1.05, 1.18, 1.22])

# Fit a 2nd-degree polynomial (captures the slight nonlinearity)
coeffs  = np.polyfit(alpha, CL, 2)
print(f"CL = {coeffs[0]:.5f}·α² + {coeffs[1]:.4f}·α + {coeffs[2]:.4f}")

# Lift slope dCL/dα at α=0° (per degree)
lift_slope = 2 * coeffs[0] * 0 + coeffs[1]
print(f"Lift slope: {lift_slope:.4f} /deg  ({lift_slope * 57.3:.3f} /rad)")

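# Zero-lift angle: the root of the fitted polynomial nearest α = 0
# (np.roots returns both roots of the quadratic, in no guaranteed order)
roots = np.roots(coeffs)
zero_lift_alpha = roots[np.argmin(np.abs(roots))]
print(f"Zero-lift angle: {zero_lift_alpha:.2f}°")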

# Plot the fit against the data
alpha_smooth = np.linspace(-5, 14, 200)
CL_fit = np.polyval(coeffs, alpha_smooth)
plt.scatter(alpha, CL, color="steelblue", s=60, zorder=5, label="Wind tunnel data")
plt.plot(alpha_smooth, CL_fit, "r-", linewidth=2, label="Quadratic fit")
plt.axhline(0, color="gray", lw=0.5)
plt.axvline(0, color="gray", lw=0.5)
plt.xlabel("Angle of Attack α (°)")
plt.ylabel("Lift Coefficient CL")
plt.title("Lift Polar — Clean Wing")
plt.legend();  plt.grid(True, alpha=0.3)
plt.show()
Quick Check · Engineer Lesson 3
When computing sample standard deviation in NumPy from test specimens, which parameter should you use?
A np.std(data): default is always correct
B np.std(data, ddof=1): ddof=1 for a sample
C np.std(data, ddof=0): always use 0
D np.sample_std(data)
Engineer Lesson 04 · Module 5

Pandas: Analysing Real Test Data

Pandas provides a DataFrame: a table structure that reads directly from CSV and Excel exports. Filter, transform, aggregate, and export: all in code that runs identically every time.

SECTION E4.1
Loading and inspecting engineering data
load_inspect.py: structural test dataset
import pandas as pd

# Load CSV exported from test rig or DAS (data acquisition system)
df = pd.read_csv("wing_fatigue_tests.csv")

# Or load from Excel directly
df = pd.read_excel("material_test_results.xlsx", sheet_name="Fatigue")

# First look — always do this before any analysis
print(df.head(5))         # first 5 rows
print(df.shape)           # (rows, columns)
print(df.columns.tolist()) # column names
print(df.dtypes)          # data types — check nothing read as text
print(df.isnull().sum())  # missing values per column
print(df.describe())      # count, mean, std, min, 25%, 50%, 75%, max
SECTION E4.2
Filtering and selecting data
filtering.py
# Select a single column
cycles = df["cycles_to_failure"]

# Filter rows: only specimens that failed (not run-outs)
failed  = df[df["status"] == "FAILURE"]
runouts = df[df["status"] == "RUNOUT"]
print(f"Failures: {len(failed)},  Run-outs: {len(runouts)}")

# Filter: specific material AND stress range
al_high = df[
    (df["material"] == "Al2024-T3") &
    (df["stress_MPa"] >= 200) &
    (df["stress_MPa"] <= 350)
]

# Filter to several materials at once
metals = df[df["material"].isin(["Al2024-T3", "Ti6Al4V", "Steel4340"])]

# Sort by cycles descending
df_sorted = df.sort_values("cycles_to_failure", ascending=False)
SECTION E4.3
Computing derived quantities
derived_columns.py: compute engineering quantities from raw data
# Compute stress from raw load and cross-section area
df["stress_MPa"]   = df["load_N"]  / df["area_mm2"]

# Compute engineering strain from displacement and gauge length
df["strain_pct"]   = (df["disp_mm"] / df["gauge_mm"]) * 100

# Flag specimens that exceeded design limit
df["overload"]     = df["stress_MPa"] > 250

# Mean stress and stress ratio for fatigue
df["stress_mean"]  = (df["stress_max"] + df["stress_min"]) / 2
df["stress_ratio"] = df["stress_min"] / df["stress_max"]

# Group by material — get mean, std, count per group (like a pivot table)
summary = df.groupby("material").agg(
    mean_UTS   = ("stress_MPa", "mean"),
    std_UTS    = ("stress_MPa", "std"),
    n_specimens= ("stress_MPa", "count")
).round(2)
print(summary)
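The snippets in this lesson assume a CSV already on disk. As a minimal sketch that runs without any file, the same derive, filter, and group-by steps can be tried on a small made-up table. The `demo` DataFrame and its numbers are illustrative assumptions, not real test data:

```python
import pandas as pd

# Hypothetical mini dataset: illustrative numbers only, not real test results
demo = pd.DataFrame({
    "material": ["Al2024-T3", "Al2024-T3", "Ti6Al4V", "Ti6Al4V"],
    "load_N":   [20000.0, 22000.0, 45000.0, 47000.0],
    "area_mm2": [50.0, 50.0, 50.0, 50.0],
})

# Derived column: N / mm² is numerically equal to MPa
demo["stress_MPa"] = demo["load_N"] / demo["area_mm2"]

# Boolean filter: keep rows at or above 440 MPa
high = demo[demo["stress_MPa"] >= 440]

# Per-material statistics with named aggregations
summary = demo.groupby("material").agg(
    mean_stress=("stress_MPa", "mean"),
    n_specimens=("stress_MPa", "count"),
)
print(high)
print(summary)
```

Swapping `demo` for a DataFrame loaded with `pd.read_csv` leaves the remaining lines unchanged, as long as the column names match.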
SECTION E4.4
Full end-to-end analysis workflow
full_analysis.py: from raw data to report-ready numbers
import pandas as pd
import numpy  as np

# 1. Load raw test data
df = pd.read_csv("fatigue_database.csv")
print(f"Loaded {len(df)} records.")

# 2. Clean — remove rows with missing critical values
df = df.dropna(subset=["stress_MPa", "cycles_to_failure"])

# 3. Compute log cycles (for S-N plot)
df["log_N"] = np.log10(df["cycles_to_failure"])

# 4. Filter to material of interest
mat = df[df["material"] == "Al2024-T3"]

# 5. Linear regression on S-N data
coeffs = np.polyfit(mat["log_N"], mat["stress_MPa"], 1)
print(f"Fatigue strength at 10^7 cycles: {np.polyval(coeffs, 7):.1f} MPa")

# 6. Export processed results
mat.to_csv("al2024_processed.csv", index=False)
print("Saved al2024_processed.csv")
Quick Check · Engineer Lesson 4
You want the average stress for each material type in your DataFrame. Which line does this?
A df.average("stress_MPa", group="material")
B df.groupby("material")["stress_MPa"].mean()
C df["material"].mean("stress_MPa")
D df.pivot("material", "stress_MPa")
Engineer Lesson 05 · Module 5

Matplotlib: Engineering Plots

Matplotlib produces publication-quality plots with full control over every element: axes, labels, annotations, line styles. No menu-clicking required.

SECTION E5.1
Stress-strain curve: the engineer's signature plot
stress_strain_plot.py
import matplotlib.pyplot as plt
import numpy as np

# Simulated tensile test data — Al 6061-T6
strain = np.array([0,    0.1,  0.2,  0.3,  0.45, 0.7,
                    1.0,  1.5,  2.0,  3.0,  4.5,  6.0,  8.0])
stress = np.array([0,    69,  138,  207,  270, 290,
                    300,  310,  318,  325,  310,  285,  240])

uts_idx   = np.argmax(stress)
uts_val   = stress[uts_idx]
uts_strain= strain[uts_idx]

fig, ax = plt.subplots(figsize=(9, 6))

# Main curve
ax.plot(strain, stress, color="#2563EB", linewidth=2.5, label="Al 6061-T6")

# Elastic line from E = 68.9 GPa, drawn up to 0.3 % strain (0.003)
ax.plot([0, 0.3], [0, 68.9*0.003*1000], "k--", lw=1, label="Elastic region")

# Annotate UTS
ax.annotate(f"UTS = {uts_val} MPa",
            xy=(uts_strain, uts_val),
            xytext=(uts_strain - 1.5, uts_val - 25),
            arrowprops=dict(arrowstyle="->", color="red"),
            color="red", fontsize=10)

ax.set_xlabel("Engineering Strain (%)", fontsize=12)
ax.set_ylabel("Engineering Stress (MPa)", fontsize=12)
ax.set_title("Tensile Test — Al 6061-T6", fontsize=14)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("stress_strain.png", dpi=300)
plt.show()
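Once the curve is in arrays, numbers such as a 0.2 % offset yield can be read off in code rather than by eye. The sketch below is one possible approach, under two assumptions: the first four data points are treated as the elastic region, and the curve is linearly interpolated between samples. It reuses the simulated data from the listing above.

```python
import numpy as np

strain = np.array([0, 0.1, 0.2, 0.3, 0.45, 0.7, 1.0, 1.5, 2.0, 3.0, 4.5, 6.0, 8.0])  # %
stress = np.array([0, 69, 138, 207, 270, 290, 300, 310, 318, 325, 310, 285, 240])    # MPa

# Elastic slope from the first four points (MPa per % strain), assumed linear
slope = np.polyfit(strain[:4], stress[:4], 1)[0]

# Gap between the measured curve and the offset line (same slope, shifted by 0.2 %)
gap = stress - slope * (strain - 0.2)

# The first sign change of the gap marks the segment where the two lines cross
i = np.where((gap[:-1] > 0) & (gap[1:] <= 0))[0][0]

# Linear interpolation inside that segment
frac = gap[i] / (gap[i] - gap[i + 1])
yield_strain = strain[i] + frac * (strain[i + 1] - strain[i])
yield_stress = stress[i] + frac * (stress[i + 1] - stress[i])
print(f"0.2% offset yield: {yield_stress:.0f} MPa at {yield_strain:.2f}% strain")
```

With sparse data like this the result depends on the interpolation, so treat it as an estimate and check it against the plot.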
SECTION E5.2
Multiple subplots: flight data summary
flight_data_plot.py: four parameters on one figure
import matplotlib.pyplot as plt
import numpy as np

t    = np.linspace(0, 120, 1201)          # 0–120 s at 10 Hz (1201 points, 0.1 s apart)
alt  = 1000 * (t/120)**0.5                # altitude climb
spd  = 60 + 80 * (t/120)                 # speed ramp
aoa  = 8 * np.exp(-t/30) + 2              # AoA reducing after rotation
nz   = 1 + 0.3 * np.sin(2*np.pi*t/15)    # normal load factor

fig, axes = plt.subplots(4, 1, figsize=(11, 10), sharex=True)
fig.suptitle("Flight Test Data — Climb Segment", fontsize=14, y=0.98)

data   = [alt, spd, aoa, nz]
labels = ["Altitude (m)", "Airspeed (m/s)", "AoA (°)", "Load Factor Nz"]
colors = ["#2563EB", "#16A34A", "#DC2626", "#9333EA"]

for ax, d, lbl, col in zip(axes, data, labels, colors):
    ax.plot(t, d, color=col, linewidth=1.5)
    ax.set_ylabel(lbl, fontsize=10)
    ax.grid(True, alpha=0.25)

axes[3].axhline(2.5, color="red", ls="--", lw=1, label="Limit load")
axes[3].legend(fontsize=9)
axes[3].set_xlabel("Time (s)", fontsize=11)

plt.tight_layout()
plt.savefig("flight_data.png", dpi=300)
plt.show()
SECTION E5.3
Scatter plot with colour mapping: visualising 3 variables
scatter_colormap.py: stress vs cycles, coloured by temperature
import matplotlib.pyplot as plt
import numpy as np

# Simulated dataset: fatigue tests at different temperatures
np.random.seed(42)
n       = 80
stress  = np.random.uniform(150, 400, n)
log_N   = 8.5 - 0.012 * stress + np.random.normal(0, 0.2, n)
temp    = np.random.uniform(20, 300, n)    # temperature, °C

fig, ax = plt.subplots(figsize=(9, 6))

sc = ax.scatter(log_N, stress, c=temp, cmap="plasma", s=50, alpha=0.8)
cbar = plt.colorbar(sc, ax=ax)
cbar.set_label("Temperature (°C)", fontsize=11)

ax.set_xlabel("log₁₀(N) — Cycles to Failure", fontsize=12)
ax.set_ylabel("Stress Amplitude (MPa)",       fontsize=12)
ax.set_title("S-N Data Coloured by Test Temperature", fontsize=13)
ax.grid(True, alpha=0.25)
plt.tight_layout()
plt.savefig("sn_temperature.png", dpi=300)
plt.show()
Quick Check · Engineer Lesson 5
To save a plot at high resolution suitable for a report or publication, which line is correct?
A plt.export("plot.png", quality=high)
B plt.savefig("plot.png", dpi=300)
C plt.show("plot.png")
D plt.save_high_res("plot.png")
Engineer Lesson 06 · Module 5

Automating Excel Reports

Excel remains the standard for sharing results in most engineering organisations. Python automates the process: read the data, run the analysis, write the report. The same script handles every new dataset.

SECTION E6.1
Reading Excel files
read_excel.py
import pandas as pd

# Read first sheet (default)
df = pd.read_excel("test_results.xlsx")

# Read a named sheet
df = pd.read_excel("test_results.xlsx", sheet_name="Fatigue")

# Read all sheets at once (returns a dict)
all_sheets = pd.read_excel("test_results.xlsx", sheet_name=None)
for name, sheet_df in all_sheets.items():
    print(f"Sheet '{name}': {len(sheet_df)} rows")

# Skip metadata rows at the top (common in lab exports)
df = pd.read_excel("test_results.xlsx", skiprows=4, header=0)

# Read only specific columns to save memory
df = pd.read_excel("test_results.xlsx",
                   usecols=["specimen_id", "material", "UTS_MPa", "cycles"])
SECTION E6.2
Writing formatted Excel reports
write_report.py: multi-sheet report with summary
import pandas as pd
import numpy  as np

# Your analysed data
raw_df = pd.read_csv("fatigue_tests.csv")
raw_df["log_N"] = np.log10(raw_df["cycles_to_failure"])

# Summary statistics per material
summary = raw_df.groupby("material").agg(
    n_specimens = ("UTS_MPa", "count"),
    mean_UTS    = ("UTS_MPa", "mean"),
    std_UTS     = ("UTS_MPa", "std"),
    min_cycles  = ("cycles_to_failure", "min"),
    max_cycles  = ("cycles_to_failure", "max")
).round(2)

# Write to Excel with multiple sheets
with pd.ExcelWriter("fatigue_report.xlsx", engine="openpyxl") as writer:
    raw_df.to_excel(writer, sheet_name="Raw Data",   index=False)
    summary.to_excel(writer, sheet_name="Summary",    index=True)

print("Saved fatigue_report.xlsx with 2 sheets.")
SECTION E6.3
Batch processing: the real power
This is where Python truly earns its place in an engineering workflow. One script processes every test file in a folder: something that would take hours manually.
batch_process.py: process every test file in a folder
import pandas as pd
import numpy  as np
import glob
import os

# Find every .xlsx file in the test_data folder
files = glob.glob("test_data/*.xlsx")
print(f"Found {len(files)} test files.")

results = []

for filepath in files:
    specimen_id = os.path.basename(filepath).replace(".xlsx", "")

    # Load each file
    df = pd.read_excel(filepath)

    # Compute stress from raw columns
    df["stress_MPa"] = df["load_N"] / df["area_mm2"]

    # Extract summary for this specimen
    row = {
        "specimen":    specimen_id,
        "material":    df["material"].iloc[0],
        "UTS_MPa":     df["stress_MPa"].max().round(1),
        # E = stress / strain: strain_pct / 100 gives strain, / 1000 converts MPa to GPa
        "E_GPa":       (df["stress_MPa"].iloc[5] / (df["strain_pct"].iloc[5] / 100) / 1000).round(1),
        "elong_pct":   df["strain_pct"].iloc[-1]
    }
    results.append(row)
    print(f"  Processed {specimen_id}  →  UTS = {row['UTS_MPa']} MPa")

# Compile all results
report = pd.DataFrame(results)
print("\n=== BATCH SUMMARY ===")
print(report)

# Save the compiled table to a single Excel file
report.to_excel("batch_summary.xlsx", index=False)
print("\nSaved batch_summary.xlsx")
What just happened: That script processed an entire folder of test files: extracted UTS, Young's modulus, and elongation from each one, compiled everything into one table, and saved a report. In Excel: open each file, copy the peak value, paste it somewhere, repeat 50 times. In Python: run once, done in seconds, with no copy-paste errors.
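The same batch pattern can be tried end to end without any lab files. The sketch below is a hedged variant, not the lesson's script: it uses `pathlib` instead of `glob`/`os`, reads CSV instead of Excel so no Excel engine is needed, and writes two made-up specimen files into a temporary folder. The `summarise_folder` helper and the file names are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

def summarise_folder(folder):
    """Return one summary row per CSV file in `folder` (hypothetical helper)."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        df = pd.read_csv(path)
        stress = df["load_N"] / df["area_mm2"]        # N/mm² == MPa
        rows.append({"specimen": path.stem,
                     "UTS_MPa": round(float(stress.max()), 1)})
    return pd.DataFrame(rows)

# Demo: two made-up test files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"load_N": [0, 10000, 15000], "area_mm2": [50, 50, 50]}).to_csv(
        Path(tmp) / "spec_A.csv", index=False)
    pd.DataFrame({"load_N": [0, 12000, 11000], "area_mm2": [48, 48, 48]}).to_csv(
        Path(tmp) / "spec_B.csv", index=False)
    report = summarise_folder(tmp)

print(report)
```

Sorting the paths keeps the report order stable between runs, which helps when you compare two versions of the same summary.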
SECTION E6.4
Python vs Excel: full command reference for engineers
Engineering task | Excel | Python
Open a file | Double-click | pd.read_csv() / pd.read_excel()
Summary stats (all columns) | =AVERAGE, =STDEV, etc. | df.describe()
Filter rows by condition | Data › AutoFilter | df[df["col"] > value]
Add computed column | Write formula, drag down | df["new"] = df["a"] / df["b"]
Group statistics (pivot table) | Insert › PivotTable | df.groupby("mat")["val"].mean()
Sort data | Data › Sort | df.sort_values("col")
Remove duplicates | Data › Remove Duplicates | df.drop_duplicates()
Count missing values | Manually scan | df.isnull().sum()
Linear regression | Add trendline to chart | np.polyfit(x, y, 1)
Solve Ax = b | MATLAB / manual | np.linalg.solve(A, b)
Plot stress-strain curve | Insert Chart (8 clicks) | plt.plot(strain, stress)
Save plot for report | Right-click › Save as image | plt.savefig("plot.png", dpi=300)
Process 50 test files | Open each one manually | for f in glob.glob("*.xlsx"): ...
Where to go next: Once you're comfortable with these lessons, explore: scipy.signal for vibration and frequency analysis (FFT), scipy.integrate for numerical integration of flight paths and ODE systems, scipy.optimize for minimisation and curve fitting beyond polyfit, and Jupyter notebooks for running analysis interactively: the standard tool in engineering research environments.
Quick Check · Engineer Lesson 6
You need to process 40 Excel files in a folder automatically. Which Python tool finds all the files?
A os.list_excel("folder/")
B glob.glob("folder/*.xlsx")
C pd.find_files("folder/")
D excel.scan("folder/")
Reference 01 · Advanced Engineering

SciPy: The Engineering Toolkit

SciPy provides the scientific algorithms that sit on top of NumPy: FFT, curve fitting, ODE solvers, and signal filtering. The tools that make Python genuinely useful for engineering work.

SECTION R1.1
FFT: frequency analysis of vibration data
The Fast Fourier Transform converts a time-domain signal into its frequency components. If your sensor is vibrating and you want to find the dominant frequency: this is how.
fft_analysis.py: find resonant frequencies in vibration data
import numpy as np
from scipy import fft
import matplotlib.pyplot as plt

# Simulate accelerometer data: 50 Hz + 120 Hz signal + noise
fs  = 1000                              # sample rate, Hz
t   = np.linspace(0, 1, fs, endpoint=False) # 1 second
sig = (np.sin(2*np.pi*50*t)          # 50 Hz component
     + 0.5*np.sin(2*np.pi*120*t)      # 120 Hz component
     + 0.3*np.random.randn(fs))        # noise

# Compute FFT
N      = len(sig)
yf     = fft.fft(sig)
xf     = fft.fftfreq(N, 1/fs)          # frequency axis
amplitude = 2/N * np.abs(yf[:N//2])    # one-sided amplitude spectrum
freqs  = xf[:N//2]

# Find dominant frequency
peak_freq = freqs[np.argmax(amplitude)]
print(f"Dominant frequency: {peak_freq:.1f} Hz")

# Plot spectrum
plt.figure(figsize=(10, 4))
plt.plot(freqs, amplitude, color="steelblue", lw=1.2)
plt.xlabel("Frequency (Hz)");  plt.ylabel("Amplitude")
plt.title("FFT Amplitude Spectrum");  plt.grid(True, alpha=0.3)
plt.xlim(0, 300)
plt.show()
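`np.argmax` only reports the single tallest peak, so the 120 Hz component in the signal above goes unmentioned. One way to list every significant peak is `scipy.signal.find_peaks`. The sketch below rebuilds the same simulated signal with a fixed seed; the height threshold of 0.25 is an assumption chosen to sit above the noise floor of this particular signal, not a general rule.

```python
import numpy as np
from scipy import fft, signal

rng = np.random.default_rng(0)                # fixed seed so the run is repeatable
fs  = 1000                                    # sample rate, Hz
t   = np.arange(fs) / fs                      # 1 second of samples
sig = (np.sin(2*np.pi*50*t)                   # 50 Hz component
       + 0.5*np.sin(2*np.pi*120*t)            # 120 Hz component
       + 0.3*rng.standard_normal(fs))         # noise

N         = len(sig)
amplitude = 2/N * np.abs(fft.rfft(sig))       # one-sided amplitude spectrum
freqs     = fft.rfftfreq(N, 1/fs)             # matching frequency axis, Hz

# Keep only peaks taller than the assumed threshold
idx, _ = signal.find_peaks(amplitude, height=0.25)
peak_freqs = freqs[idx]
print(peak_freqs)
```

For real sensor data the threshold usually needs tuning, for example relative to the median of the spectrum.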
SECTION R1.2
curve_fit: fit any equation to your data
scipy.optimize.curve_fit is more powerful than np.polyfit: you define the equation shape yourself. Exponential decay, power law, sine wave, anything. It returns the best-fit parameters and their covariance matrix, from which you can estimate each parameter's uncertainty.
curve_fit.py: fit creep data to a power law
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Define the equation to fit — power law: y = a * x^b
def power_law(x, a, b):
    return a * x ** b

# Creep strain data: time (hours) vs strain (%)
time   = np.array([1, 5, 10, 50, 100, 500, 1000])
strain = np.array([0.12, 0.21, 0.28, 0.48, 0.61, 1.05, 1.32])

# Fit — popt = best parameters, pcov = covariance (uncertainty)
popt, pcov = curve_fit(power_law, time, strain)
a, b = popt
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainty

print(f"a = {a:.4f} ± {perr[0]:.4f}")
print(f"b = {b:.4f} ± {perr[1]:.4f}")
print(f"Equation: strain = {a:.4f} * t^{b:.4f}")

# Predict strain at 2000 hours
print(f"Predicted strain at 2000h: {power_law(2000, *popt):.3f}%")

# Plot
t_smooth = np.logspace(0, 4, 200)
plt.scatter(time, strain, s=60, zorder=5, label="Data")
plt.plot(t_smooth, power_law(t_smooth, *popt), "r-", label=f"Fit: {a:.3f}·t^{b:.3f}")
plt.xscale("log");  plt.yscale("log")
plt.xlabel("Time (h)");  plt.ylabel("Strain (%)")
plt.title("Creep Curve — Power Law Fit")
plt.legend();  plt.grid(True, alpha=0.3)
plt.show()
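A power law is a straight line in log-log space, so `np.polyfit` on the logged data gives a quick cross-check and a sensible starting guess for `curve_fit`. The sketch below compares the two on the creep data above. The two fits weight the points differently (relative error versus absolute error), so close agreement rather than identical numbers is what to expect.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, b):
    return a * x ** b

time   = np.array([1, 5, 10, 50, 100, 500, 1000], dtype=float)   # hours
strain = np.array([0.12, 0.21, 0.28, 0.48, 0.61, 1.05, 1.32])    # %

# Straight-line fit in log-log space: log10(y) = log10(a) + b*log10(x)
b_log, log_a = np.polyfit(np.log10(time), np.log10(strain), 1)
a_log = 10 ** log_a

# Non-linear fit, using the log-log estimate as the initial guess p0
(a_nl, b_nl), _ = curve_fit(power_law, time, strain, p0=[a_log, b_log])

print(f"log-log polyfit: a = {a_log:.4f}, b = {b_log:.4f}")
print(f"curve_fit:       a = {a_nl:.4f}, b = {b_nl:.4f}")
```

A large gap between the two exponents would suggest the data does not follow a single power law over the whole range.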
SECTION R1.3
solve_ivp: integrate ODEs (flight dynamics, thermal)
Ordinary differential equations describe how things change over time: aircraft equations of motion, heat transfer, rocket trajectories. scipy.integrate.solve_ivp is the modern way to solve them numerically.
rocket_trajectory.py: vertical rocket ascent (simplified)
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

# Constants
g0    = 9.80665      # m/s²
m0    = 10000        # initial mass, kg
mdot  = 50           # mass flow rate, kg/s
Isp   = 300          # specific impulse, s
ve    = Isp * g0     # exhaust velocity, m/s
Cd    = 0.3          # drag coefficient
A     = 1.0          # reference area, m²
rho0  = 1.225        # sea level density, kg/m³

def rocket_eom(t, y):
    """State vector: y = [altitude, velocity, mass]"""
    h, v, m = y
    rho  = rho0 * np.exp(-h / 8500)       # exponential atmosphere
    drag = 0.5 * rho * v * abs(v) * Cd * A   # signed: always opposes the direction of motion
    thrust = mdot * ve if m > m0 - mdot*60 else 0   # burn for 60 s
    dhdt = v
    dvdt = (thrust - drag) / m - g0
    dmdt = -mdot if thrust > 0 else 0
    return [dhdt, dvdt, dmdt]

# Solve from t=0 to t=120s
sol = solve_ivp(rocket_eom, [0, 120], [0, 0, m0],
                max_step=0.5, dense_output=True)

t   = sol.t
h   = sol.y[0] / 1000    # altitude, km
v   = sol.y[1]             # velocity, m/s
print(f"Max altitude: {h.max():.2f} km")
print(f"Max velocity: {v.max():.1f} m/s")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4))
ax1.plot(t, h, color="#2563EB");  ax1.set_ylabel("Altitude (km)")
ax2.plot(t, v, color="#DC2626");  ax2.set_ylabel("Velocity (m/s)")
for ax in (ax1, ax2):
    ax.set_xlabel("Time (s)");  ax.grid(True, alpha=0.3)
plt.suptitle("Rocket Trajectory");  plt.tight_layout()
plt.show()
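Reading the maximum off the output arrays only gives the apogee to the nearest time step. `solve_ivp` can also locate it directly through its `events` argument. The sketch below uses a deliberately simpler problem than the rocket above, a drag-free vertical coast with an illustrative launch speed, so the answer can be checked against the closed-form result v0/g0 and v0²/(2·g0).

```python
import numpy as np
from scipy.integrate import solve_ivp

g0 = 9.80665    # m/s²
v0 = 100.0      # launch speed, m/s (illustrative value)

def coast(t, y):
    """Drag-free vertical coast: y = [altitude, velocity]."""
    h, v = y
    return [v, -g0]

def apogee(t, y):
    """Event function: crosses zero when vertical velocity is zero."""
    return y[1]

apogee.terminal  = True    # stop the integration at the event
apogee.direction = -1      # trigger only when velocity goes from + to -

sol = solve_ivp(coast, [0, 60], [0.0, v0], events=apogee, max_step=0.5)

t_apogee = sol.t_events[0][0]       # time of the first apogee event
h_apogee = sol.y_events[0][0][0]    # altitude component of the state at that event
print(f"Apogee: {h_apogee:.1f} m at t = {t_apogee:.2f} s")
```

The same event function would work with a three-element state like the rocket's `[altitude, velocity, mass]`, because it only looks at `y[1]`.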
SECTION R1.4
Signal filtering: remove noise from sensor data
filtering.py: low-pass filter a noisy pressure signal
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs     = 500                          # sample rate, Hz
t      = np.linspace(0, 2, 2*fs, endpoint=False)   # 2 seconds, samples exactly 1/fs apart
clean  = np.sin(2*np.pi*5*t)          # 5 Hz true signal
noisy  = clean + 0.5*np.random.randn(len(t))

# Design a 4th-order Butterworth low-pass filter, cutoff = 20 Hz
b, a   = signal.butter(4, 20, btype="low", fs=fs)
filtered = signal.filtfilt(b, a, noisy)    # zero-phase filtering

plt.figure(figsize=(10, 4))
plt.plot(t, noisy,    alpha=0.4, label="Noisy",    lw=1)
plt.plot(t, filtered, color="#DC2626", label="Filtered", lw=2)
plt.xlabel("Time (s)");  plt.ylabel("Signal")
plt.title("Butterworth Low-Pass Filter")
plt.legend();  plt.grid(True, alpha=0.3)
plt.show()
SciPy one-liner reference: the ones engineers actually search for:

scipy.integrate.quad(f, a, b): definite integral of function f from a to b
scipy.optimize.minimize(f, x0): minimise a scalar function (e.g. drag)
scipy.optimize.brentq(f, a, b): find root of f in [a,b] (e.g. zero-lift angle)
scipy.interpolate.interp1d(x, y): interpolate between data points
scipy.stats.ttest_ind(a, b): t-test: are two datasets statistically different?
scipy.stats.norm.ppf(0.95): 95th percentile of normal distribution
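A few of these one-liners in use, as a small sketch. The functions being integrated, solved, and minimised are made-up toy examples chosen so the answers are easy to verify by hand:

```python
import numpy as np
from scipy import integrate, optimize, stats

# Definite integral of x² from 0 to 3 (exact answer: 9)
area, abs_err = integrate.quad(lambda x: x**2, 0, 3)

# Root of a made-up linear lift curve CL(α) = 0.11·α + 0.22 between -5° and 5°
alpha_zero = optimize.brentq(lambda a: 0.11 * a + 0.22, -5, 5)

# Minimise a simple parabola whose minimum sits at x = 2
res = optimize.minimize(lambda x: (x[0] - 2.0) ** 2, x0=[0.0])

# 95th percentile of the standard normal distribution
z95 = stats.norm.ppf(0.95)

print(area, alpha_zero, res.x[0], z95)
```

Note that `brentq` needs the function to change sign between the two bounds, so bracket the root before calling it.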
Reference 02 · Advanced Engineering

Debugging & Profiling

Debugging and profiling are as important as writing new code. This section covers how to find errors fast and measure where time is actually being spent.

SECTION R2.1
The most common silent bugs in engineering code
silent_bugs.py: wrong answers with no error message
import numpy as np

# BUG 1: Integer division gives wrong answer silently
E   = 200000   # MPa
eps = 3        # strain in units of 1e-3, stored as an int
stress = E * (eps / 1000)  # Python 3: / is true division, gives 600.0 ✓
# In Python 2 or C++: 3 / 1000 = 0 (int / int truncates), so stress = 0
# In C++: int/int = int, so write 1000.0 to force float division

# BUG 2: Comparing floats with == (almost always wrong)
a = 0.1 + 0.2
print(a == 0.3)              # False! (floating point)
print(np.isclose(a, 0.3))   # True  ← use this instead

# BUG 3: Mutating a list you think you're copying
original = [1, 2, 3]
copy     = original          # NOT a copy — same object!
copy[0]  = 99
print(original)              # [99, 2, 3] — original changed!
# Fix:
copy     = original.copy()  # or list(original) or original[:]
# NumPy:
arr  = np.array([1.0, 2.0, 3.0])
arr2 = arr.copy()           # NOT arr2 = arr

# BUG 4: Degrees vs radians (kills calculations silently)
angle = 45                     # degrees
print(np.sin(angle))          # 0.8509 — WRONG, treats as radians
print(np.sin(np.radians(45))) # 0.7071 — correct

# BUG 5: NaN propagation — one bad value corrupts everything
data = np.array([1.0, 2.0, np.nan, 4.0])
print(np.mean(data))     # nan — mean fails silently
print(np.nanmean(data))  # 2.333 — use nan-safe versions
print(np.any(np.isnan(data)))  # True — always check your data
SECTION R2.2
Timing your code: find what's slow
profiling.py
import time
import numpy as np

# Simple timing with perf_counter (high resolution)
start = time.perf_counter()

# --- your code here ---
result = np.sum(np.random.randn(1_000_000))

elapsed = time.perf_counter() - start
print(f"Elapsed: {elapsed*1000:.2f} ms")

# --- Python loop vs NumPy --- why it matters
data = np.random.randn(100_000)

# Slow: Python loop
t0 = time.perf_counter()
total = 0
for x in data:
    total += x ** 2
print(f"Loop:  {(time.perf_counter()-t0)*1000:.1f} ms")

# Fast: NumPy vectorised
t0 = time.perf_counter()
total = np.sum(data ** 2)
print(f"NumPy: {(time.perf_counter()-t0)*1000:.1f} ms")
# NumPy is typically 50-200x faster on large arrays

# Full profiling with cProfile (run from terminal)
# python -m cProfile -s cumulative your_script.py
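A single `perf_counter` measurement is sensitive to whatever else the machine is doing. The standard-library `timeit` module repeats the measurement for you. The sketch below is one way to use it, taking the best of three runs; the `loop_sum_sq` helper is a hypothetical name for the loop version shown above.

```python
import timeit
import numpy as np

data = np.random.default_rng(0).standard_normal(100_000)

def loop_sum_sq(values):
    """Sum of squares with a plain Python loop (the slow version)."""
    total = 0.0
    for x in values:
        total += x ** 2
    return total

# Best of several repeats is less affected by background load than one run
t_loop  = min(timeit.repeat(lambda: loop_sum_sq(data), number=1, repeat=3))
t_numpy = min(timeit.repeat(lambda: np.sum(data ** 2), number=1, repeat=3))

print(f"Loop:  {t_loop*1000:.1f} ms")
print(f"NumPy: {t_numpy*1000:.2f} ms")
```

The exact ratio depends on the machine and array size, so measure on your own data before deciding what to optimise.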
SECTION R2.3
Assertions: catch wrong answers before they become wrong reports
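Assertions are sanity checks you embed in your code. They cost almost nothing when the check passes and stop the program immediately when something is wrong: far better than getting a wrong number at the end. Python skips assert statements when it is run with the -O flag, so treat them as internal sanity checks rather than the only guard on external input.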
assertions.py: defensive engineering code
import numpy as np

def compute_stress(force_N, area_mm2):
    # Guard against obviously wrong inputs
    assert force_N >= 0,    "Force must be non-negative"
    assert area_mm2 > 0,   "Area must be positive"
    stress = force_N / area_mm2
    assert stress < 2000, f"Stress {stress:.1f} MPa exceeds material limit"
    return stress

# Validate array shapes before matrix operations
A = 2.0 * np.eye(3)          # non-singular example (np.ones((3, 3)) is singular, so solve() would fail)
b = np.ones(3)
assert A.shape[1] == b.shape[0], "Matrix dimensions must match"

# Check no NaN crept in after operations
result = np.linalg.solve(A, b)
assert not np.any(np.isnan(result)), "NaN in result — check input matrix"
print("All checks passed.")
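For values that come from outside your code (a file, a user, another team's export), raising an exception is a common alternative to `assert`, because the check stays active however Python is started. A minimal sketch, where `compute_stress_checked` is a hypothetical variant of the function above:

```python
def compute_stress_checked(force_N, area_mm2):
    """Return stress in MPa; raise ValueError on invalid input.

    Hypothetical variant of compute_stress whose checks remain active
    even when Python runs with -O (which strips assert statements).
    """
    if force_N < 0:
        raise ValueError(f"Force must be non-negative, got {force_N} N")
    if area_mm2 <= 0:
        raise ValueError(f"Area must be positive, got {area_mm2} mm²")
    return force_N / area_mm2

# A bad input is rejected with a clear message instead of a division error
try:
    compute_stress_checked(10000, 0)
except ValueError as err:
    message = str(err)
    print(f"Rejected: {message}")

stress = compute_stress_checked(10000, 50)
print(stress)
```

Catching `ValueError` at the call site lets a batch script log the bad specimen and carry on with the rest.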
SECTION R2.4
Reading error messages: what they actually mean
Error | What it means | How to fix it
NameError: name 'x' is not defined | You used a variable before creating it, or misspelled it | Check spelling. Check you assigned it before using it.
TypeError: unsupported operand type | Maths on incompatible types: e.g. int + string | Convert types: int(x), float(x), str(x)
ValueError: could not convert string to float | Tried to convert "abc" or "" to a number | Check your data for non-numeric values in numeric columns
IndexError: list index out of range | Accessed index that doesn't exist: e.g. list[5] on a 3-item list | Check length first: len(x). Remember index starts at 0.
KeyError: 'column_name' | Column doesn't exist in DataFrame | Print df.columns to see actual names. Check for spaces.
LinAlgError: Singular matrix | Matrix has no inverse: system has no unique solution | Check your stiffness matrix for unconstrained DOFs
shapes (3,) and (4,) not aligned | NumPy matrix dimension mismatch | Print .shape on both arrays before the operation
segmentation fault (C++) | Accessed memory you don't own: usually array out of bounds | Check all array indices. Use at() instead of [] for bounds checking.
undefined reference to (C++) | Function declared but not defined, or missing library link | Add the implementation, or link the library: -lm, -lopencv_core
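Errors like these can also be caught in code, so a script reports the problem instead of stopping with a traceback. The sketch below triggers the singular-matrix case on purpose with a made-up matrix `K` and uses the matrix rank as a first diagnostic:

```python
import numpy as np

K = np.ones((3, 3))          # every row identical, so the matrix is singular
f = np.ones(3)

try:
    u = np.linalg.solve(K, f)
    solved = True
except np.linalg.LinAlgError as err:
    solved = False
    rank = np.linalg.matrix_rank(K)
    print(f"Solve failed: {err}")
    print(f"Rank {rank} of {K.shape[0]}: the system is under-constrained")
```

A rank lower than the matrix size tells you how many independent equations are missing, which in a stiffness matrix typically points at unconstrained degrees of freedom.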
Reference 03 · Advanced Engineering

Units, Constants & Pitfalls

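Unit errors are silent and catastrophic. The Mars Climate Orbiter, lost in 1999 after one piece of software reported impulse in pound-force seconds while another expected newton-seconds, is the best-known example. This section covers how to handle units correctly in Python.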

SECTION R3.1
Physical constants in Python
constants.py: never hardcode these, use scipy.constants
from scipy import constants as c
import numpy as np

# Physical constants (exact scipy values)
print(c.g)          # 9.80665    m/s²  standard gravity
print(c.R)          # 8.314...   J/(mol·K) universal gas constant
print(c.atm)        # 101325     Pa    standard atmosphere
print(c.speed_of_light) # 299792458 m/s
print(c.pi)         # same as np.pi

# Aerospace / mechanical constants — define these at top of every script
g0    = 9.80665     # m/s²  — standard gravity (ISA)
R_air = 287.058    # J/(kg·K) — specific gas constant for dry air
gamma = 1.4        # —  ratio of specific heats for air
P_sl  = 101325.0   # Pa   — sea level pressure (ISA)
T_sl  = 288.15     # K    — sea level temperature (ISA)
rho_sl= 1.225      # kg/m³ — sea level density (ISA)
a_sl  = np.sqrt(gamma * R_air * T_sl) # 340.3 m/s — speed of sound

# Unit conversions — name them clearly
MPa_to_psi  = 145.038
psi_to_MPa  = 1 / MPa_to_psi
kts_to_ms   = 0.514444
ms_to_kts   = 1 / kts_to_ms
ft_to_m     = 0.3048
m_to_ft     = 1 / ft_to_m
lbf_to_N    = 4.44822
deg_to_rad  = np.pi / 180    # or use np.radians()
SECTION R3.2
Formatting engineering numbers correctly
Reporting 3.141592653589793 in an engineering report is wrong. Knowing how to format numbers: significant figures, scientific notation, fixed decimal places: is part of professional engineering output.
formatting.py
# Fixed decimal places
stress = 247.38291
print(f"{stress:.2f}")        # 247.38        (2 decimal places)
print(f"{stress:.0f}")        # 247            (round to integer)

# Significant figures
val = 0.0001234567
print(f"{val:.3g}")           # 0.000123       (3 sig figs)
print(f"{val:.4g}")           # 0.0001235

# Scientific notation
Re = 4_500_000
print(f"{Re:.2e}")            # 4.50e+06
print(f"{Re:,.0f}")           # 4,500,000      (thousands separator)

# Width padding — align columns
data = [("Steel", 400, 210), ("Aluminium", 270, 69), ("CFRP", 600, 150)]
print(f"{'Material':12} {'UTS (MPa)':>10} {'E (GPa)':>10}")
print("-" * 34)
for name, uts, E in data:
    print(f"{name:12} {uts:>10.0f} {E:>10.0f}")
SECTION R3.3
The pint library: automatic unit tracking
The safest approach: use pint to attach units to your numbers. It will raise an error if you accidentally add metres to feet, or multiply stress by the wrong area units. Install with pip install pint.
units_pint.py: unit-safe calculations
from pint import UnitRegistry

ureg = UnitRegistry()
Q    = ureg.Quantity

# Attach units to values
force  = Q(10000, "N")
area   = Q(50,    "mm**2")

# Arithmetic is unit-aware
stress = force / area
print(stress)               # 200.0 N/mm²
print(stress.to("MPa"))     # 200.0 MPa  (N/mm² = MPa)
print(stress.to("psi"))     # 29007.5 psi

# Unit mismatch raises error immediately
height = Q(100, "ft")
# force + height  →  DimensionalityError: cannot add [force] and [length]

# Convert velocity
speed = Q(250, "m/s")
print(speed.to("knots"))   # 486.0 knots
print(speed.to("mph"))     # 559.2 mph
SECTION R3.4
NumPy interpolation: look up values between data points
np.interp is one of the most-used functions in engineering code. Look up aerodynamic coefficients from a table, find material properties at a temperature not in your dataset, interpolate standard atmosphere values.
interpolation.py
import numpy as np

# CL vs AoA lookup table (from wind tunnel data)
aoa_data = np.array([ 0,    2,    4,    6,    8,   10,   12])
CL_data  = np.array([ 0.18, 0.42, 0.65, 0.88, 1.05, 1.18, 1.22])

# Interpolate at arbitrary angle
aoa_query = 5.5
CL_at_5p5 = np.interp(aoa_query, aoa_data, CL_data)
print(f"CL at {aoa_query}°: {CL_at_5p5:.4f}")

# Interpolate multiple angles at once
query_angles = np.array([1, 3, 5, 7, 9, 11])
CL_interp    = np.interp(query_angles, aoa_data, CL_data)
print(CL_interp)

# scipy for more control (cubic spline — smoother)
from scipy.interpolate import CubicSpline
cs = CubicSpline(aoa_data, CL_data)
print(cs(5.5))                   # smoother than linear interp
print(cs(5.5, 1))               # derivative: dCL/dα at 5.5°
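One behaviour worth knowing before relying on `np.interp`: it does not extrapolate. A query outside the table silently returns the nearest end value, which can hide a mistake such as asking for CL beyond the tested range. The sketch below shows the default and one hedged option, passing `left` and `right` so out-of-range queries come back as NaN:

```python
import numpy as np

aoa_data = np.array([0, 2, 4, 6, 8, 10, 12])
CL_data  = np.array([0.18, 0.42, 0.65, 0.88, 1.05, 1.18, 1.22])

# Outside the table, np.interp silently returns the end values
CL_high = np.interp(15, aoa_data, CL_data)    # clamps to the last entry
CL_low  = np.interp(-3, aoa_data, CL_data)    # clamps to the first entry
print(CL_high, CL_low)

# Returning NaN instead makes out-of-range queries visible
CL_safe = np.interp(15, aoa_data, CL_data, left=np.nan, right=np.nan)
print(np.isnan(CL_safe))
```

Whether clamping or NaN is the better choice depends on the application; the point is to make the choice deliberately.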
Reference 04

Pro Cheat Sheet

NumPy, pandas, matplotlib, and C++: condensed into a single reference page.

NUMPY: EVERY OPERATION YOU NEED
Task | Code | Notes
Create array | np.array([1, 2, 3]) | Basic creation
Range of floats | np.linspace(0, 10, 100) | 100 points from 0 to 10
Range with step | np.arange(0, 10, 0.5) | 0, 0.5, 1.0 … 9.5
Zeros / Ones | np.zeros(n) / np.ones(n) | Fill array with 0 or 1
Identity matrix | np.eye(n) | n×n identity
Element-wise maths | a + b, a * b, a ** 2 | No loop needed
Matrix multiply | A @ B | NOT A * B (that's element-wise)
Dot product | np.dot(a, b) | Or a @ b for vectors
Transpose | A.T |
Inverse | np.linalg.inv(A) | Use solve() instead when possible
Solve Ax = b | np.linalg.solve(A, b) | Faster and more stable than inv()
Eigenvalues | np.linalg.eig(A) | Returns values and vectors
Mean / Std | np.mean(x) / np.std(x, ddof=1) | ddof=1 for sample
Min / Max | np.min(x) / np.max(x) |
Index of min/max | np.argmin(x) / np.argmax(x) | Returns index, not value
Sort | np.sort(x) | Returns sorted copy
Filter by condition | x[x > threshold] | Boolean indexing
Count condition true | np.sum(x > threshold) |
Replace values | np.where(x > 0, x, 0) | Clip negatives to zero
Clamp to range | np.clip(x, low, high) |
Interpolate | np.interp(x_new, x, y) | Linear interpolation
Cumulative sum | np.cumsum(x) | Running total
Gradient | np.gradient(y, x) | Numerical derivative dy/dx
Trapezoid integral | np.trapz(y, x) | Numerical integration; named np.trapezoid from NumPy 2.0
Linear fit | np.polyfit(x, y, 1) | Returns [slope, intercept]
Evaluate poly | np.polyval(coeffs, x) | Use after polyfit
Trig (degrees) | np.sin(np.radians(angle)) | Always convert first
Safe float compare | np.isclose(a, b) | Never use == on floats
Check for NaN | np.isnan(x) / np.any(np.isnan(x)) | Always check after loading data
NaN-safe mean | np.nanmean(x) | Ignores NaN values
Reshape | x.reshape(3, 4) | 3 rows, 4 cols
Flatten to 1D | x.flatten() |
Stack arrays | np.vstack([a, b]) / np.hstack([a, b]) | Vertical / horizontal
Log base 10 | np.log10(x) | For S-N curves, dB
Natural log | np.log(x) |
PANDAS: DATA ANALYSIS ONE-LINERS
Task | Code | Notes
Load CSV | pd.read_csv("file.csv") |
Load Excel | pd.read_excel("file.xlsx", sheet_name="Results") |
First 5 rows | df.head() |
Summary stats | df.describe() | count, mean, std, min, max
Column names | df.columns.tolist() |
Missing values | df.isnull().sum() | Per column
Drop missing rows | df.dropna(subset=["col"]) |
Fill missing | df["col"].fillna(0) | Or fillna(df["col"].mean())
Filter rows | df[df["col"] > value] |
Multiple conditions | df[(df["a"] > 1) & (df["b"] == "X")] | Parentheses required
Select columns | df[["col1", "col2"]] | Double brackets = DataFrame
Add column | df["new"] = df["a"] / df["b"] |
Rename column | df.rename(columns={"old":"new"}) |
Sort | df.sort_values("col", ascending=False) |
Group statistics | df.groupby("cat")["val"].mean() | Like pivot table
Multiple aggregations | df.groupby("cat").agg(m=("val","mean"), s=("val","std")) |
Merge two DataFrames | pd.merge(df1, df2, on="id") | Like VLOOKUP
Save to CSV | df.to_csv("out.csv", index=False) |
Save to Excel | df.to_excel("out.xlsx", index=False) |
Multi-sheet Excel | with pd.ExcelWriter("out.xlsx") as w: df.to_excel(w, sheet_name="S1") |
MATPLOTLIB: PLOT FORMATTING REFERENCE
Element | Code | Notes
Figure size | plt.figure(figsize=(10, 6)) | Width × height in inches
Line plot | plt.plot(x, y, color="steelblue", lw=2, ls="--") | ls: "-", "--", ":", "-."
Scatter plot | plt.scatter(x, y, s=40, c="red", alpha=0.6) | s=marker size
Bar chart | plt.bar(categories, values) |
Histogram | plt.hist(data, bins=30, edgecolor="white") |
Horizontal line | plt.axhline(y=value, color="red", ls="--") | Limit lines, thresholds
Vertical line | plt.axvline(x=value, color="gray", ls=":") |
Axis labels | plt.xlabel("Time (s)", fontsize=12) | Always label axes
Title | plt.title("My Plot", fontsize=14) |
Legend | plt.legend(loc="upper right", fontsize=10) | Needs label= in plot()
Grid | plt.grid(True, alpha=0.3) | alpha controls opacity
Axis limits | plt.xlim(0, 100) / plt.ylim(0, 500) |
Log scale | plt.xscale("log") / plt.yscale("log") | S-N curves, frequency plots
Annotation | plt.annotate("text", xy=(x,y), xytext=(tx,ty), arrowprops=dict(arrowstyle="->")) |
Tight layout | plt.tight_layout() | Prevents label clipping
Save high-res | plt.savefig("plot.png", dpi=300, bbox_inches="tight") | 300 dpi for reports
Subplots | fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True) | 2 rows, shared x axis
Colour map scatter | sc = plt.scatter(x, y, c=z, cmap="plasma"); plt.colorbar(sc) | 3rd variable as colour
C++ QUICK REFERENCE: PRACTICAL DAY-TO-DAY
Task | Code | Notes
Compile + run | g++ -O2 -std=c++17 -o out file.cpp && ./out | -O2 = optimise, -std=c++17 = modern C++
Compile with maths | g++ -O2 file.cpp -lm -o out | Links the maths library for sqrt, sin etc; g++ usually links it already, gcc for C needs the flag
Integer to float division | double r = (double)a / b; | Cast one to double first
Read file line by line | ifstream f("data.txt"); while(getline(f, line)){…} |
Write to file | ofstream f("out.txt"); f << value << "\n"; |
Vector (dynamic array) | vector<double> v = {1.0, 2.0}; v.push_back(3.0); | Use instead of raw arrays
Loop over vector | for (double x : v) { cout << x; } | Range-based for
Format output | cout << fixed << setprecision(3) << val; | #include <iomanip>
String to number | double x = stod("3.14"); int n = stoi("42"); |
Number to string | string s = to_string(42); |
Max / Min of two | max(a, b) / min(a, b) | #include <algorithm>
Abs value | abs(x) / fabs(x) | fabs for floats
Power | pow(base, exp) | #include <cmath>
Struct (data bundle) | struct Point { double x, y; }; Point p = {1.0, 2.0}; | Group related data
Check file opened | if (!file.is_open()) { cerr << "Error"; return 1; } | Always check
GOLDEN RULES: THINGS THAT WILL SAVE YOU HOURS
01: Always print .shape before matrix operations. Shape mismatches are among the most common NumPy errors, and one print statement can prevent an hour of debugging.

02: Never use == with floats. Use np.isclose(a, b) or abs(a - b) < 1e-9.

03: Name your constants. g0 = 9.80665 at the top of the file. Never write 9.81 or 9.8 directly in a formula.

04: arr2 = arr does NOT copy. Use arr2 = arr.copy() in NumPy. For lists: lst2 = lst.copy().

05: Trig always in radians. Wrap every degree value: np.sin(np.radians(45)). Make it a habit.

06: Check for NaN before computing stats. np.any(np.isnan(data)) first. One NaN poisons every mean, std, and sum.

07: Save figures before plt.show(). savefig must come before show() or you'll get a blank file.

08: Use ddof=1 for sample std dev. Your test specimens are a sample, not the entire population.

09: Comment units, not just values. force = 10000 # N: future you will thank present you.

10: Use pathlib not os.path. from pathlib import Path: cleaner syntax, works on Windows and Linux identically.
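Rules 02, 04, and 06 are easy to verify for yourself. The sketch below uses only the standard library, so `math.isclose` stands in for `np.isclose` and a plain list stands in for a NumPy array; the variable names are illustrative.

```python
import math

# Rule 02: floating-point results rarely compare equal with ==
a = 0.1 + 0.2
print(a == 0.3)               # exact comparison fails
print(math.isclose(a, 0.3))   # tolerance-based comparison succeeds

# Rule 04: assignment only adds a second name for the same list
readings = [1.0, 2.0, 3.0]
alias    = readings           # same object
snapshot = readings.copy()    # independent copy
readings[0] = 99.0
print(alias[0], snapshot[0])  # alias follows the change, snapshot does not

# Rule 06: a single NaN propagates through a sum (and so through a mean)
data = [1.0, float("nan"), 3.0]
print(math.isnan(sum(data)))
```

The same behaviour applies to NumPy arrays: `arr2 = arr` shares memory, and `np.mean` returns NaN as soon as one element is NaN.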
Aviation · AV01

ISA Atmosphere Model

The International Standard Atmosphere is the foundation of every performance calculation in aviation. Before you compute fuel burn, range, or climb rate, you need density, pressure, temperature, and speed of sound at altitude. This is how you compute them correctly in Python and C++.

SECTION AV1.1
Complete ISA model: troposphere and stratosphere
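The troposphere (0–11 km, up to about FL360) has a constant lapse rate. The lower stratosphere (11–20 km) is isothermal. This model covers everything from sea level to 20 km (about FL656); above that the temperature rises again and the isothermal formula no longer applies.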
isa.py: full ISA model, vectorised for any altitude array
import numpy as np

# ISA constants
T0    = 288.15     # K    sea-level temperature
P0    = 101325.0   # Pa   sea-level pressure
rho0  = 1.225      # kg/m³ sea-level density
L     = 0.0065     # K/m  troposphere lapse rate
g0    = 9.80665    # m/s²
R     = 287.058    # J/(kg·K) dry air
gamma = 1.4
H_trop = 11000     # m — tropopause altitude
T_trop = T0 - L * H_trop   # 216.65 K at tropopause
P_trop = P0 * (T_trop / T0) ** (g0 / (L * R))

def isa(h_m):
    """
    ISA atmosphere for scalar or array altitude h_m (metres).
    Returns: T (K), P (Pa), rho (kg/m³), a (m/s)
    """
    h = np.atleast_1d(np.asarray(h_m, dtype=float))
    T   = np.where(h <= H_trop,
                  T0 - L * h,
                  T_trop)
    P   = np.where(h <= H_trop,
                  P0 * (T / T0) ** (g0 / (L * R)),
                  P_trop * np.exp(-g0 * (h - H_trop) / (R * T_trop)))
    rho = P / (R * T)
    a   = np.sqrt(gamma * R * T)
    return T.squeeze(), P.squeeze(), rho.squeeze(), a.squeeze()

# --- Usage ---
altitudes_ft = np.array([0, 10000, 20000, 30000, 35000, 39000])
altitudes_m  = altitudes_ft * 0.3048

T, P, rho, a = isa(altitudes_m)

print(f"{'Alt (ft)':10} {'T (K)':8} {'P (hPa)':10} {'rho (kg/m³)':13} {'a (kts)'}")
print("-"*55)
for i, alt in enumerate(altitudes_ft):
    print(f"{alt:10.0f} {T[i]:8.2f} {P[i]/100:10.2f} {rho[i]:13.4f} {a[i]/0.514444:7.1f}")
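For reference, the equations that isa() evaluates can be written out as below. The subscript 11 denotes tropopause values (T_trop and P_trop in the code); the symbols otherwise match the constants defined at the top of the listing.

```latex
\begin{aligned}
\text{Troposphere } (h \le 11\,000\ \text{m}):\quad
  & T(h) = T_0 - L\,h, \qquad
    P(h) = P_0 \left(\frac{T(h)}{T_0}\right)^{g_0/(L R)} \\[4pt]
\text{Lower stratosphere } (11\,000 < h \le 20\,000\ \text{m}):\quad
  & T(h) = T_{11} = 216.65\ \text{K}, \qquad
    P(h) = P_{11}\,\exp\!\left(-\frac{g_0\,(h - 11\,000)}{R\,T_{11}}\right) \\[4pt]
\text{Both layers}:\quad
  & \rho = \frac{P}{R\,T}, \qquad a = \sqrt{\gamma R T}
\end{aligned}
```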
SECTION AV1.2
Mach number, TAS, CAS, EAS conversions
Speed in aviation has four forms. Knowing how to convert between them is essential for any performance calculation.
speed_conversions.py
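import numpy as np
from isa import isa, rho0, P0   # ISA model and constants from isa.py (Section AV1.1)

def mach_to_tas(M, h_m):
    """True Airspeed from Mach and altitude"""
    _, _, _, a = isa(h_m)                    # only the speed of sound is needed
    return M * a                             # m/s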

def tas_to_eas(tas, h_m):
    """Equivalent Airspeed = TAS * sqrt(sigma)"""
    _, _, rho, _ = isa(h_m)
    sigma = rho / rho0
    return tas * np.sqrt(sigma)

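def cas_to_tas(cas, h_m):
    """CAS to TAS (both in m/s) using the subsonic compressibility correction"""
    _, P, _, a = isa(h_m)
    a0 = isa(0)[3]                           # sea-level speed of sound
    # Impact pressure sensed by the pitot system (isentropic, gamma = 1.4)
    qc = P0 * ((1 + 0.2 * (cas / a0)**2) ** 3.5 - 1)
    # Mach number that gives the same impact pressure at altitude
    M  = np.sqrt(5 * ((qc / P + 1) ** (2/7) - 1))
    return M * a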

# Example: A320 cruise at FL350, M0.78
h_cruise = 35000 * 0.3048      # FL350 in metres
M        = 0.78
TAS      = mach_to_tas(M, h_cruise)
EAS      = tas_to_eas(TAS, h_cruise)

print(f"FL350, M{M}:")
print(f"  TAS = {TAS:.1f} m/s  ({TAS/0.514444:.1f} kts)")
print(f"  EAS = {EAS:.1f} m/s  ({EAS/0.514444:.1f} kts)")
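Output at FL350, M0.78:
TAS = 231.3 m/s (449.6 kts)  ·  EAS = 128.8 m/s (250.3 kts)
The density ratio at FL350 is ~0.31, so EAS = TAS × √0.31 ≈ 0.557 × TAS. Lift depends on dynamic pressure ½ρV², which equals ½ρ₀·EAS², so for a given weight and configuration an aircraft stalls at the same EAS regardless of altitude (compressibility effects aside).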
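To see the density-ratio argument in numbers, here is a dependency-free sketch that recomputes σ in the troposphere and shows how the true airspeed at stall grows with altitude while the stall EAS stays fixed. The function names and the 70 m/s stall EAS are illustrative assumptions, not values from the lesson.

```python
import math

T0, P0, RHO0 = 288.15, 101325.0, 1.225        # sea-level ISA values
LAPSE, G0, R_AIR = 0.0065, 9.80665, 287.058   # K/m, m/s^2, J/(kg*K)

def density_ratio(h_m):
    """sigma = rho / rho0 in the ISA troposphere (valid for h_m <= 11 000 m)."""
    T = T0 - LAPSE * h_m
    P = P0 * (T / T0) ** (G0 / (LAPSE * R_AIR))
    return P / (R_AIR * T) / RHO0

def stall_tas(eas_stall, h_m):
    """TAS at stall for an aircraft whose stall EAS is eas_stall (m/s)."""
    return eas_stall / math.sqrt(density_ratio(h_m))

sigma = density_ratio(35000 * 0.3048)
print(f"sigma at FL350 = {sigma:.3f}, sqrt(sigma) = {math.sqrt(sigma):.3f}")

for fl in (0, 100, 200, 350):
    tas = stall_tas(70.0, fl * 100 * 0.3048)   # 70 m/s stall EAS is an assumed example
    print(f"FL{fl:03d}: stall EAS 70.0 m/s -> stall TAS {tas:.1f} m/s")
```

The EAS column never changes; only the TAS needed to generate the same dynamic pressure does.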
Aviation · AV02

Cost Index & Fuel Burn

Cost Index (CI) is the single number that tells the FMS how to trade time cost against fuel cost. CI = 0 means fly for minimum fuel. CI = max means fly as fast as possible regardless of fuel. Every airline optimises CI per route, per day, per aircraft type. Here's how to model it.

SECTION AV2.1
What Cost Index actually is
The FMS minimises total operating cost per flight: Cost = CI × Time + Fuel. When CI = 0, only fuel matters: fly at MRC (Maximum Range Cruise). When CI is high, time cost dominates: fly near VMO. The optimal cruise Mach for a given CI sits between these extremes.
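The trade-off can be stated as a one-line optimisation. As a sketch, assume still air, let f(V) be the fuel flow at true airspeed V, and express CI in the same units as f (fuel mass per unit time). The cost per unit distance and its minimum are then:

```latex
c(V) = \frac{CI + f(V)}{V},
\qquad
\frac{dc}{dV} = 0
\;\Longrightarrow\;
V\,f'(V) = CI + f(V)
```

With CI = 0 the condition reduces to V f'(V) = f(V), which is the maximum of specific range V / f(V), i.e. MRC. Increasing CI moves the solution to a higher speed, because the time term CI / V rewards flying faster.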
cost_index.py: CI optimisation and Mach selection
import numpy as np
import matplotlib.pyplot as plt
from isa import isa   # ISA model from isa.py (Section AV1.1)

# Aircraft parameters (A320-like)
W       = 65000    # kg  cruise weight
S       = 122.6    # m²  wing area
CD0     = 0.0240   # zero-lift drag
k       = 0.0375   # induced drag factor, 1/(π·AR·e)
eta     = 0.30     # overall propulsive efficiency (cruise)
LHV     = 43.2e6   # J/kg  Jet-A lower heating value

# Cruise conditions
h_m     = 35000 * 0.3048
T, P, rho, a = isa(h_m)

# Mach sweep
M_range = np.linspace(0.65, 0.84, 200)
V       = M_range * a                             # TAS, m/s
q       = 0.5 * rho * V**2                        # dynamic pressure
CL      = W * 9.80665 / (q * S)                  # level flight CL
CD      = CD0 + k * CL**2                         # total drag
D       = q * S * CD                               # drag force, N
ff_kgs  = D * V / (eta * LHV)                    # fuel flow, kg/s
ff_kghr = ff_kgs * 3600                             # fuel flow, kg/hr
SR      = V / ff_kgs / 1000                       # specific range, km/kg

# Cost Index analysis
# CI units: kg/min  (fuel cost equivalent of 1 min of time)
def opt_mach_for_ci(CI_kgmin):
    """Find Mach that minimises total cost for given CI"""
    time_cost_per_km = CI_kgmin / (V * 60 / 1000)   # kg_eq/km from time
    fuel_cost_per_km = ff_kgs / (V / 1000)           # kg/km
    total_cost       = time_cost_per_km + fuel_cost_per_km
    idx              = np.argmin(total_cost)
    return M_range[idx], ff_kghr[idx]

print(f"{'CI (kg/min)':14} {'Opt Mach':12} {'Fuel Flow (kg/hr)'}")
print("-"*42)
for CI in [0, 10, 20, 40, 60, 80, 100]:
    M_opt, ff_opt = opt_mach_for_ci(CI)
    print(f"{CI:14} {M_opt:12.3f} {ff_opt:.1f}")

# Plot: fuel flow and SR vs Mach
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(M_range, ff_kghr, color="#2563EB", lw=2)
ax1.set_xlabel("Mach");  ax1.set_ylabel("Fuel Flow (kg/hr)")
ax1.set_title("Fuel Flow vs Mach");  ax1.grid(True, alpha=0.3)
ax2.plot(M_range, SR,      color="#16A34A", lw=2)
ax2.set_xlabel("Mach");  ax2.set_ylabel("Specific Range (km/kg)")
ax2.set_title("Specific Range vs Mach");  ax2.grid(True, alpha=0.3)
plt.tight_layout();  plt.show()
SECTION AV2.2
Breguet range equation: how far can you go?
The Breguet equation gives range from fuel weight, aerodynamic efficiency, and specific fuel consumption. It's the fundamental equation of airline route planning.
breguet.py
import numpy as np
from isa import isa   # ISA model from isa.py (Section AV1.1)

def breguet_range(V_ms, LD, SFC_kgNs, W_initial, W_final):
    """
    Breguet range equation.
    V_ms      : cruise TAS, m/s
    LD        : cruise lift-to-drag ratio
    SFC_kgNs  : specific fuel consumption, kg/(N·s)
    W_initial : initial cruise weight, N
    W_final   : final cruise weight (= W_initial - W_fuel), N
    Returns   : range in km
    """
    return (V_ms / (SFC_kgNs * 9.80665)) * LD * np.log(W_initial / W_final) / 1000

# A320neo-like parameters
MTOW    = 79000    # kg — max takeoff weight
OEW     = 42600    # kg — operating empty weight
payload = 15000    # kg — passengers + bags (~150 pax)
fuel    = 18000    # kg — usable fuel

h_m     = 35000 * 0.3048
_, _, _, a = isa(h_m)
V_cruise= 0.78 * a         # TAS at M0.78, FL350
LD      = 17.5              # typical A320neo cruise L/D
SFC     = 1.65e-5           # kg/(N·s) — LEAP-1A26 typical

W_init  = (OEW + payload + fuel) * 9.80665
W_final = (OEW + payload) * 9.80665

# Named range_km rather than R so it cannot shadow the gas constant R used by isa()
range_km = breguet_range(V_cruise, LD, SFC, W_init, W_final)
print(f"Cruise Mach:     M0.78")
print(f"TAS:             {V_cruise/0.514444:.1f} kts")
print(f"L/D:             {LD}")
print(f"Fuel burned:     {fuel:,} kg")
print(f"Range (Breguet): {range_km:.0f} km  ({range_km/1.852:.0f} nm)")

# Sensitivity: range vs payload (range-payload curve)
payloads  = np.linspace(0, 20000, 100)
ranges    = []
for pl in payloads:
    fuel_avail = MTOW - OEW - pl        # fuel limited by MTOW
    fuel_avail = np.clip(fuel_avail, 0, 18000)
    Wi = (OEW + pl + fuel_avail) * 9.80665
    Wf = (OEW + pl) * 9.80665
    ranges.append(breguet_range(V_cruise, LD, SFC, Wi, Wf))

import matplotlib.pyplot as plt
plt.figure(figsize=(9, 5))
plt.plot(payloads/1000, ranges, color="#2563EB", lw=2)
plt.xlabel("Payload (tonnes)");  plt.ylabel("Range (km)")
plt.title("Range–Payload Curve (A320neo-like)")
plt.grid(True, alpha=0.3);  plt.tight_layout();  plt.show()
Why this matters operationally: Dispatchers use this kind of model to decide whether to carry extra fuel (tankering) when fuel at the destination is expensive. Extra fuel is extra weight, and extra weight increases the burn. If the cost of that additional burn exceeds the saving at the destination, don't tanker. The Breguet equation gives a first-order estimate of the penalty per kg of extra fuel carried.
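The tankering decision can be sketched by solving the Breguet equation for fuel instead of range. Everything below is an illustrative assumption: the function names, the 2,000 km sector, the 60,000 kg landing mass, and the fuel prices are hypothetical, and real tankering tools also account for climb, descent, reserves, and structural limits.

```python
import math

G0 = 9.80665  # m/s^2

def cruise_fuel(range_km, landing_mass_kg, V_ms, LD, SFC_kgNs):
    """Fuel (kg) burned over range_km: Breguet solved for W_initial - W_final."""
    breguet_factor_km = V_ms * LD / (SFC_kgNs * G0) / 1000
    return landing_mass_kg * (math.exp(range_km / breguet_factor_km) - 1)

def tankering_saving(extra_kg, range_km, landing_mass_kg,
                     price_origin, price_dest,
                     V_ms=231.0, LD=17.5, SFC_kgNs=1.65e-5):
    """Return (net saving in currency units, extra burn in kg) for tankered fuel."""
    base  = cruise_fuel(range_km, landing_mass_kg, V_ms, LD, SFC_kgNs)
    heavy = cruise_fuel(range_km, landing_mass_kg + extra_kg, V_ms, LD, SFC_kgNs)
    penalty_kg     = heavy - base                       # burn caused by the extra weight
    cost_at_origin = (extra_kg + penalty_kg) * price_origin
    cost_avoided   = extra_kg * price_dest              # fuel not bought at destination
    return cost_avoided - cost_at_origin, penalty_kg

# Hypothetical prices per kg: cheaper at origin than at destination
saving, penalty = tankering_saving(extra_kg=3000, range_km=2000,
                                   landing_mass_kg=60000,
                                   price_origin=0.80, price_dest=1.00)
print(f"Extra burn to carry 3000 kg: {penalty:.0f} kg")
print(f"Net saving from tankering:   {saving:.0f} currency units")
```

A positive saving means tankering pays off on this sector; with equal prices at both ends the result is always negative, because the extra burn is pure cost.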
Aviation · AV03

Aircraft Performance

Performance engineering calculates what an aircraft can actually do: climb rate, cruise ceiling, takeoff distance, fuel to destination. These are the numbers that go into flight manuals, dispatch releases, and weight & balance calculations.

SECTION AV3.1
Climb performance: rate of climb and time to altitude
climb.py: ROC, time, and fuel to climb
import numpy as np
from scipy.integrate import solve_ivp
from isa import isa   # ISA model from isa.py (Section AV1.1)

# Aircraft (A320-like at typical climb weight)
W_N   = 70000 * 9.80665   # N
S     = 122.6               # m²
CD0   = 0.0280              # climb config (no flaps)
k     = 0.0375
SFC   = 1.80e-5             # kg/(N·s) climb SFC (higher than cruise)

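# Thrust model: T = T_sl * (rho/rho0)^0.75  (simplified)
T_SL  = 120000              # N, total SL static thrust (2 × CFM56)

# Climb speed schedule (assumed values): constant EAS low down, constant Mach higher up.
# Holding M0.76 all the way from 3000 ft puts drag above thrust in this simplified
# model, so the climb would never start.
V_EAS   = 150.0             # m/s EAS (~290 kts) below the crossover altitude
M_CLIMB = 0.76              # constant Mach above the crossover altitude

def climb_ode(t, y):
    """y = [altitude m, fuel burned kg]. Weight is held constant for simplicity."""
    h, mf = y
    _, _, rho, a = isa(h)
    sigma  = rho / 1.225
    thrust = T_SL * sigma ** 0.75                # thrust at altitude
    V    = float(min(V_EAS / np.sqrt(sigma), M_CLIMB * a))   # TAS, m/s
    q    = 0.5 * rho * V**2
    CL   = W_N / (q * S)
    CD   = CD0 + k * CL**2
    D    = q * S * CD
    ROC  = V * (thrust - D) / W_N               # m/s rate of climb
    ff   = thrust * SFC                          # kg/s fuel flow
    return [ROC, ff]

# Integrate from 3000ft to FL350
h_init = 3000 * 0.3048
h_end  = 35000 * 0.3048

# Stop event: when altitude reaches FL350
def reached_cruise(t, y): return y[0] - h_end
reached_cruise.terminal = True

sol = solve_ivp(climb_ode, [0, 7200], [h_init, 0],
                events=reached_cruise, max_step=10)

if sol.t_events[0].size == 0:
    # The event never fired: excess thrust was too small to reach FL350 in time
    print("FL350 not reached: check thrust, drag, and speed schedule")
else:
    t_climb    = sol.t_events[0][0]
    fuel_climb = sol.y_events[0][0][1]
    print(f"Time to climb FL030→FL350: {t_climb/60:.1f} min")
    print(f"Fuel to climb:             {fuel_climb:.0f} kg")
    print(f"Average ROC:               {(h_end-h_init)/t_climb*196.85:.0f} fpm")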
SECTION AV3.2
Cruise ceiling and step-climb optimisation
step_climb.py: find optimum altitude as fuel burns off
import numpy as np
from isa import isa   # ISA model from isa.py (Section AV1.1)

def opt_altitude(W_kg, M=0.78):
    """
    Find altitude that maximises specific range for given weight and Mach.
    Uses simplified drag model.
    """
    altitudes = np.linspace(25000, 43000, 200) * 0.3048
    best_SR, best_alt = 0, altitudes[0]
    for h in altitudes:
        T, P, rho, a = isa(h)
        V  = M * a
        q  = 0.5 * rho * V**2
        CL = W_kg * 9.80665 / (q * 122.6)
        CD = 0.0240 + 0.0375 * CL**2
        D  = q * 122.6 * CD
        ff = D * V / (0.30 * 43.2e6)
        SR = V / ff / 1000
        if SR > best_SR:
            best_SR, best_alt = SR, h
    return best_alt / 0.3048, best_SR

# Track optimum altitude as fuel burns off during flight
weights = np.arange(75000, 52000, -1000)   # kg, fuel burning off
print(f"{'Weight (kg)':14} {'Opt Alt (ft)':15} {'Spec. Range (km/kg)'}")
print("-"*46)
for W in weights[::5]:           # print every 5th step
    alt, sr = opt_altitude(W)
    print(f"{W:14,} {alt:15.0f} {sr:.4f}")
Step climb logic: As fuel burns and the aircraft gets lighter, the optimum altitude increases. Airlines request step climbs from ATC (FL350 → FL370 → FL390) to track the optimum. Each step saves roughly 0.5–1.5% fuel on a long-haul flight, which is significant on a 15-hour sector.
Aviation · AV04

Structural Loads & Fatigue

Every flight imposes loads on the airframe: manoeuvres, gusts, landing impacts. Fatigue is the accumulation of damage from repeated loads over thousands of flights. These tools let you build V-n diagrams, plot S-N curves, and apply Miner's rule for damage summation.

SECTION AV4.1
V-n diagram: the structural flight envelope
vn_diagram.py: CS-25 transport category
import numpy as np
import matplotlib.pyplot as plt

# Aircraft parameters
W     = 70000    # kg  — design weight
S     = 122.6    # m²
CLmax = 1.52     # clean configuration
CLmin = -0.80    # negative lift limit

# CS-25 load factor limits
n_lim_pos = 2.5   # positive limit (CS 25.337)
n_lim_neg = -1.0  # negative limit
n_ult_pos = n_lim_pos * 1.5   # 1.5 × limit = ultimate

# Reference conditions (sea level ISA)
rho   = 1.225
g     = 9.80665
W_N   = W * g

# Stall speed at 1g (EAS)
Vs    = np.sqrt(2 * W_N / (rho * S * CLmax))   # m/s EAS
Vs_kts= Vs / 0.514444

# Manoeuvre speed Va = Vs * sqrt(n_lim)
Va    = Vs * np.sqrt(n_lim_pos)
Vc    = 175        # m/s EAS — design cruise speed
Vd    = 210        # m/s EAS — design dive speed

# Stall boundary curves
V_range = np.linspace(0, Vd, 500)
n_pos   = (rho * V_range**2 * S * CLmax) / (2 * W_N)   # positive stall
n_neg   = (rho * V_range**2 * S * CLmin) / (2 * W_N)   # negative stall

fig, ax = plt.subplots(figsize=(10, 6))

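# Positive stall boundary (up to n_lim)
mask_pos = n_pos <= n_lim_pos
ax.plot(V_range[mask_pos] / 0.514444, n_pos[mask_pos],
        color="#2563EB", lw=2, label="Stall boundary")

# Negative stall boundary (down to the negative limit)
mask_neg = n_neg >= n_lim_neg
ax.plot(V_range[mask_neg] / 0.514444, n_neg[mask_neg],
        color="#2563EB", lw=2)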

# Structural limits
ax.axhline(n_lim_pos, color="#DC2626", lw=1.5, ls="--", label=f"Limit load n={n_lim_pos}g")
ax.axhline(n_ult_pos, color="#DC2626", lw=1,   ls=":",  label=f"Ultimate load n={n_ult_pos}g")
ax.axhline(n_lim_neg, color="#2563EB", lw=1.5, ls="--", label=f"Negative limit n={n_lim_neg}g")

# Design speeds, drawn as vertical lines
ax.axvline(Va  / 0.514444, color="gray", lw=1, ls=":", label=f"Va = {Va/0.514444:.0f} kts")
ax.axvline(Vc  / 0.514444, color="green", lw=1, ls=":", label=f"Vc = {Vc/0.514444:.0f} kts")
ax.axvline(Vd  / 0.514444, color="red",   lw=1, ls=":", label=f"Vd = {Vd/0.514444:.0f} kts")

ax.fill_between(V_range[mask_pos] / 0.514444,
                 n_pos[mask_pos], 0, alpha=0.07, color="#2563EB")
ax.set_xlabel("EAS (knots)");  ax.set_ylabel("Load Factor n (g)")
ax.set_title("V-n Diagram — CS-25 Transport")
ax.legend(fontsize=9);  ax.grid(True, alpha=0.3)
ax.set_xlim(0, Vd/0.514444 + 20);  ax.set_ylim(-1.5, 4.5)
plt.tight_layout();  plt.show()
print(f"1g stall speed: {Vs_kts:.1f} kts EAS")
print(f"Va (manoeuvre): {Va/0.514444:.1f} kts EAS")
SECTION AV4.2
Miner's rule: cumulative fatigue damage
miners_rule.py: fleet life monitoring
import numpy as np

# S-N curve for Al 2024-T3 (log-log linear fit from AV3 regression)
# log10(N) = a - b * log10(sigma)
a_sn = 15.2
b_sn = 4.8

def cycles_to_failure(sigma_MPa):
    """N from S-N curve"""
    return 10 ** (a_sn - b_sn * np.log10(sigma_MPa))

# Mission spectrum: typical short-haul aircraft
# Each row: [stress amplitude MPa, occurrences per flight]
spectrum = np.array([
    [250,  1   ],   # rotation / landing
    [180,  3   ],   # turbulence moderate
    [120,  8   ],   # light turbulence
    [80,   25  ],   # manoeuvres
    [50,   100 ],   # pressurisation cycles within flight
    [30,   500 ],   # minor vibration
])

sigma = spectrum[:, 0]
n_occ = spectrum[:, 1]
N_f   = cycles_to_failure(sigma)
D_per_flight = np.sum(n_occ / N_f)   # Miner's sum per flight
life_flights  = 1 / D_per_flight       # flights until D = 1.0

print(f"{'Stress (MPa)':15} {'n/flight':12} {'N_f':15} {'n/N (damage)'}")
print("-"*55)
for i in range(len(sigma)):
    print(f"{sigma[i]:15.0f} {n_occ[i]:12.0f} {N_f[i]:15.2e} {n_occ[i]/N_f[i]:.2e}")
print(f"\nDamage per flight: {D_per_flight:.2e}")
print(f"Predicted life:    {life_flights:,.0f} flights")
print(f"                   {life_flights/365:.0f} years at 1 flight/day")

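# Current fleet status
# This spectrum predicts a life of about 2,400 flights, so flights_flown
# must stay below that for the remaining life to be meaningful
flights_flown   = 1250
damage_accrued  = flights_flown * D_per_flight
remaining_life  = max(1 - damage_accrued, 0) / D_per_flight   # 0 once D reaches 1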
print(f"\nFlights flown:     {flights_flown:,}")
print(f"Damage accrued:    {damage_accrued:.3f}  ({damage_accrued*100:.1f}%)")
print(f"Remaining life:    {remaining_life:,.0f} flights")
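Written out, the damage sum that this script evaluates is the following. The symbols match the code: a and b are a_sn and b_sn, n_i is the number of occurrences per flight, and N_i is the cycles to failure at stress amplitude σ_i.

```latex
\log_{10} N_i = a - b\,\log_{10}\sigma_i
\;\Longrightarrow\;
N_i = 10^{a}\,\sigma_i^{-b},
\qquad
D_{\text{flight}} = \sum_i \frac{n_i}{N_i},
\qquad
\text{life (flights)} = \frac{1}{D_{\text{flight}}}
```

Miner's rule assumes damage accumulates linearly and independently of load order, with failure predicted when the accumulated damage reaches 1.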
Aviation · AV05

Aerodynamic Data Analysis

Wind tunnel and flight test produce tables of numbers. This module shows how to extract aerodynamic coefficients, build drag polars, fit lift curves, and compute induced drag efficiency, all from raw data.

SECTION AV5.1
Drag polar: extracting CD0 and k from flight test data
drag_polar.py: fit parabolic drag polar from flight test points
import numpy as np
import matplotlib.pyplot as plt

# Flight test data: (CL, CD) pairs from level-flight test points
# Each row is one test condition (different weight/altitude/speed)
CL_data = np.array([0.20, 0.30, 0.40, 0.50, 0.60,
                     0.70, 0.80, 0.90, 1.00, 1.10])
CD_data = np.array([0.0258, 0.0268, 0.0284, 0.0311, 0.0347,
                     0.0393, 0.0454, 0.0529, 0.0625, 0.0741])

# Fit CD = CD0 + k * CL^2  →  linear regression on (CL², CD)
CL2      = CL_data**2
coeffs   = np.polyfit(CL2, CD_data, 1)
k_fit    = coeffs[0]
CD0_fit  = coeffs[1]

# Oswald efficiency factor
AR       = 9.4         # A320 aspect ratio
e_oswald = 1 / (np.pi * AR * k_fit)

print(f"Drag polar fit:")
print(f"  CD0   = {CD0_fit:.5f}")
print(f"  k     = {k_fit:.5f}")
print(f"  e     = {e_oswald:.3f}  (Oswald efficiency)")

# Maximum L/D and the CL at which it occurs
# At max L/D, induced drag equals zero-lift drag: k*CL^2 = CD0, so CD = 2*CD0
CL_maxLD = np.sqrt(CD0_fit / k_fit)
LD_max   = 1 / (2 * np.sqrt(CD0_fit * k_fit))
print(f"  CL at max L/D = {CL_maxLD:.3f}")
print(f"  Max L/D       = {LD_max:.1f}")

# Plot the polar
CL_curve = np.linspace(0, 1.3, 200)
CD_curve = CD0_fit + k_fit * CL_curve**2

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# CL vs CD
ax1.scatter(CD_data, CL_data, color="steelblue", s=50, zorder=5, label="Test data")
ax1.plot(CD_curve, CL_curve, "r-", lw=2, label=f"Fit: CD0={CD0_fit:.4f}, k={k_fit:.4f}")
ax1.axvline(2*CD0_fit, color="green", ls="--", lw=1, label="CD at max L/D (2·CD0)")
ax1.set_xlabel("CD");  ax1.set_ylabel("CL")
ax1.set_title("Drag Polar");  ax1.legend(fontsize=8);  ax1.grid(True, alpha=0.3)

# L/D vs CL
LD_curve = CL_curve / CD_curve
ax2.plot(CL_curve, LD_curve, color="#16A34A", lw=2)
ax2.axvline(CL_maxLD, color="red", ls="--", lw=1, label=f"Best CL={CL_maxLD:.3f}, L/D={LD_max:.1f}")
ax2.set_xlabel("CL");  ax2.set_ylabel("L/D")
ax2.set_title("Lift-to-Drag Ratio");  ax2.legend(fontsize=9);  ax2.grid(True, alpha=0.3)
plt.tight_layout();  plt.show()
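The maximum lift-to-drag ratio follows directly from the parabolic polar. A short derivation, using the same CD0 and k as the fit:

```latex
\frac{L}{D} = \frac{C_L}{C_{D0} + k\,C_L^2},
\qquad
\frac{d}{dC_L}\!\left(\frac{L}{D}\right) = 0
\;\Longrightarrow\;
C_{D0} = k\,C_L^2
\;\Longrightarrow\;
C_L^{*} = \sqrt{\frac{C_{D0}}{k}}
```

```latex
C_D^{*} = C_{D0} + k\,{C_L^{*}}^2 = 2\,C_{D0},
\qquad
\left(\frac{L}{D}\right)_{\max} = \frac{C_L^{*}}{2\,C_{D0}} = \frac{1}{2\sqrt{C_{D0}\,k}}
```

At the best L/D point, induced drag and zero-lift drag are equal, so the total drag coefficient is exactly twice CD0.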
Aviation · AV06

Flight Data Analysis · FDR & QAR

Every commercial aircraft records hundreds of parameters every second into the Flight Data Recorder (FDR) and Quick Access Recorder (QAR). Airlines use QAR data for safety monitoring, fuel analysis, and exceedance detection. This is how you process it with Python.

SECTION AV6.1
Loading and inspecting QAR data
load_qar.py: load, inspect, and clean flight data
import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt

# QAR data is usually a CSV or binary file exported from the ACMS
# Typical columns: time, altitude, airspeed, heading, N1%, temp,
#                  fuel_flow, load_factor, pitch, roll, etc.
df = pd.read_csv("flight_WB123_20260322.csv")

# Basic inspection
print(f"Flight duration: {len(df)/4:.0f} seconds ({len(df)/4/60:.1f} min)")  # 4 Hz data
print(f"Parameters: {list(df.columns)}")
print(df.describe().round(2))

# Check for invalid/missing data
print("\nMissing values:")
print(df.isnull().sum())

# Replace clearly invalid sensor spikes (e.g. altitude = 0 mid-flight)
# Caution: this also replaces genuine zeros, such as readings on a sea-level runway
df["altitude_ft"] = df["altitude_ft"].replace(0, np.nan).interpolate()
SECTION AV6.2
Exceedance detection: the core of Flight Data Monitoring
Airlines run every flight through an FDM (Flight Data Monitoring) program that checks thousands of parameters against limits. Any exceedance triggers a review. Here's how to build that logic.
fdm_exceedances.py: automated safety monitoring
import pandas as pd
import numpy  as np

# df is the cleaned QAR DataFrame loaded in load_qar.py (Section AV6.1)

# Define exceedance thresholds (simplified A320 limits)
limits = {
    "load_factor_nz":  (-0.5,   2.0 ),   # g, smooth-air ops limits
    "pitch_deg":       (-5.0,   25.0),   # deg
    "bank_angle_deg":  (-30.0,  30.0),   # deg
    "N1_pct":          (0,      104.0),  # %
    "airspeed_kts":    (0,      350.0),  # kts (VMO)
    "vsi_fpm":         (-2000, 2000 ),   # fpm vertical speed
}

exceedances = []

for param, (lo, hi) in limits.items():
    if param not in df.columns:
        continue
    col   = df[param]
    exceed= (col < lo) | (col > hi)

    if exceed.any():
        # Find contiguous exceedance windows. Padding with zeros catches
        # windows that start on the first sample or run to the last one.
        flags  = np.concatenate(([0], exceed.to_numpy().astype(int), [0]))
        edges  = np.diff(flags)
        starts = np.flatnonzero(edges == 1)     # position of first exceeding row
        ends   = np.flatnonzero(edges == -1)    # one past the last exceeding row
        for s, e in zip(starts, ends):
            window   = col.iloc[s:e]
            peak_val = window.max() if window.max() > hi else window.min()
            duration = (e - s) / 4              # seconds (4Hz data)
            exceedances.append({
                "parameter": param,
                "limit":     hi if peak_val > hi else lo,
                "peak_value": round(float(peak_val), 2),
                "duration_s": duration,
                "time_s":     s / 4
            })

result = pd.DataFrame(exceedances)
if result.empty:          # .empty is a property, not a method
    print("No exceedances detected. Clean flight.")
else:
    print(f"{len(result)} exceedances detected:")
    print(result.to_string(index=False))
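The core idea, grouping consecutive out-of-limit samples into one event, can be isolated in a dependency-free sketch. The function name `exceedance_windows` and the synthetic load-factor trace are assumptions for illustration; the 4 Hz rate matches the listing above.

```python
from itertools import groupby

def exceedance_windows(samples, lo, hi, rate_hz=4):
    """Return (start_s, duration_s, peak) for each contiguous run outside [lo, hi]."""
    windows  = []
    position = 0                                    # index of the first sample in the run
    for outside, run in groupby(samples, key=lambda v: v < lo or v > hi):
        run = list(run)
        if outside:
            # Peak = the sample furthest beyond whichever limit it violates
            peak = max(run, key=lambda v: max(v - hi, lo - v))
            windows.append((position / rate_hz, len(run) / rate_hz, peak))
        position += len(run)
    return windows

# Synthetic 4 Hz load-factor trace with three separate events
nz = [1.0, 1.1, 2.3, 2.6, 2.2, 1.2, 1.0, -0.7, 0.9, 2.1]

for start_s, duration_s, peak in exceedance_windows(nz, lo=-0.5, hi=2.0):
    print(f"t = {start_s:.2f} s, duration = {duration_s:.2f} s, peak = {peak} g")
```

Because groupby works on runs, an event that starts on the first sample or continues to the last sample is captured without any special handling.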
SECTION AV6.3
Fuel burn analysis: per-flight benchmarking
fuel_analysis.py: compare actual vs planned fuel burn across a fleet
import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt

# Load month of flight records (from OPS system export)
flights = pd.read_csv("fleet_march2026.csv")
# Expected columns: flight_no, reg, route, dist_nm, fuel_plan_kg,
#                   fuel_actual_kg, pax, cargo_kg, wind_comp_kts

# Fuel delta: actual minus planned
flights["fuel_delta"]     = flights["fuel_actual_kg"] - flights["fuel_plan_kg"]
flights["fuel_delta_pct"] = flights["fuel_delta"] / flights["fuel_plan_kg"] * 100
flights["fuel_per_nm"]    = flights["fuel_actual_kg"] / flights["dist_nm"]

# Fleet summary
print("=== Fleet Fuel Performance — March 2026 ===")
print(f"Flights analysed:     {len(flights)}")
print(f"Mean fuel delta:      {flights.fuel_delta.mean():.0f} kg ({flights.fuel_delta_pct.mean():.1f}%)")
print(f"Std deviation:        {flights.fuel_delta.std():.0f} kg")
print(f"Worst flight:         +{flights.fuel_delta.max():.0f} kg")
print(f"Best flight:          {flights.fuel_delta.min():.0f} kg")

# Per-tail analysis (identify aircraft with chronic over-burn)
per_reg = flights.groupby("reg").agg(
    flights       = ("flight_no", "count"),
    mean_delta_kg = ("fuel_delta", "mean"),
    total_excess  = ("fuel_delta", "sum")
).round(1).sort_values("mean_delta_kg", ascending=False)
print("\nPer-aircraft fuel performance:")
print(per_reg.head(10))

# Correlation: wind vs fuel delta
corr = flights["wind_comp_kts"].corr(flights["fuel_delta"])
print(f"\nWind vs fuel delta correlation: {corr:.3f}")
# Expect a negative correlation if wind_comp_kts is positive for tailwind: headwind → more fuel

# Plot distribution of fuel delta
plt.figure(figsize=(9, 4))
plt.hist(flights["fuel_delta"], bins=40, color="steelblue", edgecolor="white")
plt.axvline(0, color="red", lw=1.5, label="Plan")
plt.axvline(flights.fuel_delta.mean(), color="orange", lw=1.5, ls="--", label=f"Mean: {flights.fuel_delta.mean():.0f} kg")
plt.xlabel("Fuel Delta (kg)");  plt.ylabel("Flights")
plt.title("Actual vs Planned Fuel — March 2026")
plt.legend();  plt.tight_layout();  plt.show()
Aviation Python quick reference (essential one-liners):

isa(h_m) → T, P, rho, a at any altitude
mach_to_tas(M, h) → TAS in m/s
breguet_range(V, LD, SFC, Wi, Wf) → range in km
np.polyfit(CL**2, CD, 1) → drag polar coefficients [k, CD0]
np.sqrt(CD0/k) → CL at max L/D
df[df["nz"] > 2.0] → filter exceedances from QAR data
df.groupby("reg")["fuel_delta"].mean() → per-aircraft fuel performance
solve_ivp(eom, [0,T], y0, events=cruise_reached) → integrate climb to cruise
Data Processing · DP01

Cleaning Messy Data

Real data is broken. Missing values, impossible readings, duplicate rows, columns that are numbers stored as text. Every analysis starts here. Get this wrong and every result downstream is wrong too.

The first thing to run on any new dataset
Before touching the data, understand what you have. These five lines reveal most of what needs fixing.
audit.py: always run this first
import pandas as pd
import numpy  as np

df = pd.read_csv("sensor_data.csv")

print(df.shape)                    # (rows, columns)
print(df.dtypes)                   # spot columns stored as wrong type
print(df.isnull().sum())          # missing values per column
print(df.duplicated().sum())     # duplicate rows
print(df.describe())              # min/max will expose impossible values

# Spot columns that should be numeric but aren't
for col in df.select_dtypes('object').columns:
    sample = df[col].dropna().head(3).tolist()
    print(f"{col}: {sample}")
Handling missing values
Different strategies depending on why the data is missing. Dropping is not always the right answer.
missing_values.py
# Drop rows missing critical values
df = df.dropna(subset=["timestamp", "altitude_ft"])

# Fill sensor gaps with forward-fill (last known value)
df["temperature"] = df["temperature"].ffill()

# Fill with column mean (for scattered random gaps)
df["pressure"] = df["pressure"].fillna(df["pressure"].mean())

# Interpolate (smooth gap fill — good for time series)
df["airspeed"] = df["airspeed"].interpolate(method="linear")

# Check NaN-safe mean vs regular mean
print(np.nanmean(df["load_factor"]))   # ignores NaN
print(np.mean(df["load_factor"]))      # returns NaN if any NaN present
Outlier detection and removal
outliers.py
import numpy as np

col = df["altitude_ft"]

# Method 1: IQR fence (robust to skewed distributions)
Q1, Q3 = col.quantile([0.25, 0.75])
IQR     = Q3 - Q1
lo, hi  = Q1 - 1.5*IQR, Q3 + 1.5*IQR
df_clean = df[col.between(lo, hi)]
print(f"Removed {len(df) - len(df_clean)} outliers")

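# Method 2: Z-score (assumes normal distribution)
# Computed with pandas (equivalent to scipy.stats.zscore with ddof=0) so the mask
# stays index-aligned with df; dropping NaN first would leave a mask shorter than df
z        = ((col - col.mean()) / col.std(ddof=0)).abs()
df_clean = df[z < 3]           # keep within 3 standard deviations (NaN rows are dropped)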

# Method 3: Physical bounds (best for engineering data)
# You know the sensor range — use it
df = df[df["altitude_ft"].between(-1000, 60000)]
df = df[df["airspeed_kts"].between(0, 600)]
df = df[df["load_factor"].between(-3, 5)]
Type conversion and unit standardisation
types_and_units.py
# Columns stored as strings when they should be numbers
df["speed"] = pd.to_numeric(df["speed"], errors="coerce")
# errors="coerce" turns unparseable values into NaN instead of crashing

# Mixed units in one column — e.g. "250 kts" and "128 m/s"
def parse_speed(val):
    val = str(val).strip()
    if "kts" in val:
        return float(val.replace("kts", "")) * 0.514444   # to m/s
    elif "m/s" in val:
        return float(val.replace("m/s", ""))
    return np.nan

df["speed_ms"] = df["speed_raw"].apply(parse_speed)

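# Standardise column names (common with multi-source data)
# regex=False treats "(" and ")" as literal characters rather than regex syntax
df.columns = (df.columns
    .str.strip()
    .str.lower()
    .str.replace(" ", "_", regex=False)
    .str.replace("(", "", regex=False)
    .str.replace(")", "", regex=False)
)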

# Remove exact duplicate rows
df = df.drop_duplicates()
# Remove rows that repeat the same timestamp and sensor (keeps the first occurrence)
df = df.drop_duplicates(subset=["timestamp", "sensor_id"])
The cleaning checklist for every new dataset:
1. Check shape and dtypes
2. Count NaN per column, then decide: drop, fill, or interpolate
3. Check for duplicates
4. Run describe(): spot impossible min/max values
5. Standardise column names and units
6. Convert string-encoded numbers with pd.to_numeric(errors="coerce")
7. Assert final shape and NaN count before proceeding
Data Processing · DP02

Time Series Data

Sensor logs, QAR data, weather records, and stock prices all arrive as time series. Pandas has a full datetime index system built for this. This page covers resampling, rolling windows, and gap detection.

Parsing and indexing datetime data
datetime_index.py
import pandas as pd

# Parse timestamps on load — always specify format if you know it
df = pd.read_csv("flight_log.csv",
                  parse_dates=["timestamp"],
                  date_format="%Y-%m-%dT%H:%M:%S.%f")

# Set as index — unlocks all time-series operations
df = df.set_index("timestamp").sort_index()

# Extract components
df["hour"]    = df.index.hour
df["date"]    = df.index.date
df["weekday"] = df.index.day_name()

# Slice by time range (use .loc for row selection on a DatetimeIndex)
morning   = df.between_time("08:00", "12:00")   # 08:00 to 12:00 on every day in the data
march     = df.loc["2026-03"]                   # entire month
window    = df.loc["2026-03-01":"2026-03-15"]

# Check for time gaps (critical for sensor data)
time_diff = df.index.to_series().diff()
gaps      = time_diff[time_diff > pd.Timedelta("1s")]
print(f"Gaps found: {len(gaps)}")
print(gaps)
Resampling: changing the time resolution
Your data arrives at 100 Hz. You want 1 Hz for analysis. Or daily averages from hourly data. Resample handles this in one line.
resample.py
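# Downsample: 100 Hz → 1 Hz (mean over each 1-second window)
# numeric_only=True skips text columns such as "weekday", which cannot be averaged
df_1hz = df.resample("1s").mean(numeric_only=True)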

# Downsample to 1-minute max load factor
df_1min_max = df["load_factor"].resample("1min").max()

# Upsample and interpolate (fill gaps to uniform spacing)
df_uniform = df.resample("10ms").interpolate(method="time")

# Common resample strings:
# "10ms" = 10 milliseconds   "1s"  = 1 second
# "1min" = 1 minute          "1h"  = 1 hour
# "1D"   = 1 day             "1W"  = 1 week
# "1ME"  = 1 month end       "1YE" = 1 year end
# ("ME" and "YE" need pandas 2.2 or later; older versions use "M" and "Y")

# Multiple aggregations in one pass
summary = df.resample("1min").agg({
    "altitude_ft": ["mean", "max"],
    "airspeed_kts": ["mean", "std"],
    "load_factor": ["min", "max"]
})
Rolling windows: smoothing and trend detection
rolling.py
# Rolling mean — smooth out noise (window = number of samples)
df["alt_smooth"]  = df["altitude_ft"].rolling(window=50).mean()

# Rolling std — detect sudden changes / turbulence events
df["alt_std"]     = df["altitude_ft"].rolling(window=50).std()

# Rolling max: peak load in any 5-second window (500 samples at 100 Hz)
df["peak_nz"]     = df["load_factor"].rolling(window=500).max()

# Exponential weighted mean (more weight on recent points)
df["ema_speed"]   = df["airspeed_kts"].ewm(span=20).mean()

# Rate of change: derivative (delta per second, not per sample)
df["d_altitude"]  = df["altitude_ft"].diff() / df["altitude_ft"].index.to_series().diff().dt.total_seconds()
# d_altitude is now rate of climb in ft/s

import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6), sharex=True)
ax1.plot(df.index, df["altitude_ft"], alpha=0.3, label="Raw")
ax1.plot(df.index, df["alt_smooth"],  lw=2,    label="Smoothed")
ax2.plot(df.index, df["alt_std"], color="red",  label="Turbulence proxy")
for ax in (ax1, ax2):
    ax.legend(); ax.grid(True, alpha=0.3)
plt.tight_layout(); plt.show()
Time series checklist: Parse timestamps on load → set as index → sort → check for gaps → resample to uniform spacing → then apply rolling windows. Skipping any of these steps can produce silently wrong results, because sample-count windows and row-to-row differences assume sorted, evenly spaced data.
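The checklist can be sketched end to end on synthetic data. This is a minimal illustration, not a production pipeline: the 10 Hz rate, the column names timestamp and altitude_ft, and the 2-second dropout are all made up for the example.

```python
import numpy as np
import pandas as pd

# Synthetic 10 Hz altitude log (hypothetical columns), stored as text timestamps
fmt = "%Y-%m-%d %H:%M:%S.%f"
t = pd.date_range("2024-01-01 12:00:00", periods=100, freq="100ms")
raw = pd.DataFrame({"timestamp": t.strftime(fmt),
                    "altitude_ft": np.linspace(1000.0, 1099.0, 100)})

# Simulate a 2 s dropout (rows 40-59) and out-of-order rows
raw = raw.drop(index=range(40, 60)).sample(frac=1.0, random_state=0)

# Steps 1-3: parse timestamps, set as index, sort
df = raw.assign(timestamp=pd.to_datetime(raw["timestamp"], format=fmt))
df = df.set_index("timestamp").sort_index()

# Step 4: check for gaps larger than the expected 100 ms spacing
dt = df.index.to_series().diff()
gaps = dt[dt > pd.Timedelta("100ms")]
print(f"Gaps found: {len(gaps)}")

# Step 5: resample to a uniform 1 s grid (mean of each window)
df_1s = df.resample("1s").mean()

# Step 6: rolling mean on the uniform grid (3 samples = 3 s here)
df_1s["alt_smooth"] = df_1s["altitude_ft"].rolling(window=3).mean()
print(df_1s)
```

The two empty 1-second bins stay NaN after resampling, and the rolling mean is NaN wherever its window touches them, so the dropout stays visible instead of being smoothed over.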
Data Processing · DP03

Reading Any File Format

Engineering data arrives in CSV, Excel, JSON, HDF5, Parquet, binary .dat, and occasionally formats invented by one vendor in 1994 and never updated since. This page covers them all.

CSV and Excel: the basics done right
csv_excel.py
import pandas as pd

# CSV with non-standard delimiters (semicolon, tab)
df = pd.read_csv("data.csv",  sep=";")
df = pd.read_csv("data.tsv",  sep="\t")

# CSV with metadata rows at top (common in lab exports)
df = pd.read_csv("test.csv", skiprows=6, header=0)

# CSV with multiple headers (units row below column names)
df     = pd.read_csv("test.csv", header=[0, 1])
df.columns = ["_".join(col).strip() for col in df.columns]

# Read specific columns only (large files)
df = pd.read_csv("big.csv", usecols=["time", "altitude", "speed"])

# Read in chunks (files too large for memory)
chunks = []
for chunk in pd.read_csv("huge.csv", chunksize=100_000):
    chunks.append(chunk[chunk["altitude"] > 10000])   # filter before concat
df = pd.concat(chunks, ignore_index=True)

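# Excel: list the sheets, then read one sheet or all of them
# (reading .xlsx needs openpyxl: pip install openpyxl)
xl   = pd.ExcelFile("results.xlsx")
print(xl.sheet_names)
df   = xl.parse("Sheet1")
sheets = pd.read_excel("results.xlsx", sheet_name=None)   # dict: sheet name -> DataFrame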
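The metadata-rows and units-row cases are easier to follow on a tiny in-memory file. A minimal sketch, assuming a made-up export with two metadata lines, a names row, and a units row; the file contents and column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical lab export: 2 metadata lines, then names row, then units row
raw = (
    "Rig: hypothetical test stand\n"
    "Operator: n/a\n"
    "time,altitude,speed\n"
    "s,ft,kts\n"
    "0,1000,250\n"
    "1,1010,251\n"
)

# skiprows drops the metadata; header=[0, 1] reads names + units as a 2-level header
df = pd.read_csv(io.StringIO(raw), skiprows=2, header=[0, 1])

# Flatten ("time", "s") into "time_s" so columns are plain strings again
df.columns = ["_".join(col).strip() for col in df.columns]
print(df.columns.tolist())
print(df)
```

Keeping the unit in the column name means it travels with the data through every later merge and plot.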
JSON and nested data
json_data.py
import pandas as pd
import json

# Simple flat JSON
df = pd.read_json("data.json")

# Nested JSON — flatten it
with open("flight.json") as f:
    data = json.load(f)

# If structure is {"flight": {"params": [...]}}
df = pd.json_normalize(data["flight"]["params"])

# Deeply nested: expand a column of dicts
# df["nested_col"] = [{"x": 1, "y": 2}, {"x": 3, "y": 4}, ...]
expanded = pd.json_normalize(df["nested_col"].tolist())
expanded.index = df.index          # keep rows aligned for the join
df = df.drop("nested_col", axis=1).join(expanded)
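To see what json_normalize does to nested keys, here is a small sketch on an in-memory dictionary. The record layout and field names (t, pos, alt_ft) are invented for illustration; only the {"flight": {"params": [...]}} shape comes from the example above.

```python
import pandas as pd

# Hypothetical nested records shaped like {"flight": {"params": [...]}}
data = {"flight": {"id": "demo", "params": [
    {"t": 0, "pos": {"lat": 51.47, "lon": -0.45}, "alt_ft": 80},
    {"t": 1, "pos": {"lat": 51.48, "lon": -0.44}, "alt_ft": 350},
]}}

# Nested dict keys become dotted column names such as "pos.lat"
flat = pd.json_normalize(data["flight"]["params"])
print(sorted(flat.columns))

# sep changes the joiner, which is handy if dots in column names get in the way
flat_us = pd.json_normalize(data["flight"]["params"], sep="_")
print(sorted(flat_us.columns))
```

Each list item becomes one row, and each nested dict is spread across columns rather than stored as an object.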
HDF5: large scientific datasets
HDF5 is a standard format for large simulation outputs, wind tunnel databases, and datasets too large to fit in memory. It is used by CFD solvers, NASA datasets, and many scientific data archives.
hdf5.py: install: pip install h5py tables
import h5py
import numpy  as np
import pandas as pd

# Read HDF5 with h5py — inspect structure first
with h5py.File("simulation.h5", "r") as f:
    print(list(f.keys()))                   # top-level groups
    def print_structure(name, obj):
        print(name, type(obj).__name__)
    f.visititems(print_structure)            # full tree

    # Read a dataset
    pressure = f["flow_field/pressure"][:]  # [:] loads into numpy array
    coords   = f["mesh/coordinates"][:]

# Read HDF5 with pandas (simpler for tabular data)
df = pd.read_hdf("results.h5", key="/run_001/measurements")

# Write HDF5 (efficient storage for large DataFrames)
df.to_hdf("output.h5", key="/results", mode="w", complevel=6)
Parquet: the modern standard for large tabular data
parquet.py: install: pip install pyarrow
import pandas as pd

# Parquet: columnar format, 5-20x smaller than CSV, much faster to read
df = pd.read_parquet("flight_data.parquet")

# Read only specific columns (doesn't load rest from disk)
df = pd.read_parquet("data.parquet", columns=["time", "altitude"])

# Write (replace your CSVs with this for anything > 10 MB)
df.to_parquet("output.parquet", index=False, compression="snappy")

# Example benchmark: 1 million row flight dataset (figures vary with data and hardware)
# CSV read:     ~4.2 seconds,  180 MB on disk
# Parquet read: ~0.3 seconds,   12 MB on disk
Binary .dat files from test rigs
Data acquisition systems (DAS) and older test equipment export binary files with fixed-width records. You need to know the byte structure, which is usually documented in the instrument manual.
binary_dat.py: reading fixed-width binary data
import struct
import numpy as np
import pandas as pd

# Example format: each record = timestamp (uint32) + 4 float32 channels
# Record size = 4 + 4*4 = 20 bytes
record_fmt  = "<I4f"       # little-endian: uint32 + 4×float32
record_size = struct.calcsize(record_fmt)

records = []
with open("test_run.dat", "rb") as f:
    while True:
        raw = f.read(record_size)
        if len(raw) < record_size:
            break
        t, ch1, ch2, ch3, ch4 = struct.unpack(record_fmt, raw)
        records.append((t, ch1, ch2, ch3, ch4))

df = pd.DataFrame(records,
     columns=["timestamp_ms", "pressure", "temperature", "flow", "voltage"])

# numpy fromfile — faster for uniform-type binary arrays
data = np.fromfile("raw_samples.dat", dtype=np.float32)
data = data.reshape(-1, 4)   # reshape into (n_samples, 4_channels)
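For mixed-type records like the uint32 + 4×float32 layout above, np.fromfile can also read the whole file in one call if you describe the record as a NumPy structured dtype. A minimal sketch, assuming the same 20-byte layout; the file name and the three records written here are made up so the example runs on its own.

```python
import os
import struct
import tempfile

import numpy as np
import pandas as pd

# Same layout as above: little-endian uint32 timestamp + 4 float32 channels
record_dtype = np.dtype([
    ("timestamp_ms", "<u4"),
    ("pressure",     "<f4"),
    ("temperature",  "<f4"),
    ("flow",         "<f4"),
    ("voltage",      "<f4"),
])
# Sanity check: the dtype and the struct format describe the same 20 bytes
assert record_dtype.itemsize == struct.calcsize("<I4f")

# Write three made-up records to a temporary file
path = os.path.join(tempfile.mkdtemp(), "demo_run.dat")
with open(path, "wb") as f:
    for i in range(3):
        f.write(struct.pack("<I4f", i * 10, 101.3 + i, 20.0, 1.5, 4.9))

# One call reads every record; fields are accessed by name
records = np.fromfile(path, dtype=record_dtype)
df = pd.DataFrame(records)
print(df)
```

This avoids the Python-level loop, which matters once a file holds millions of records.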
Data Processing · DP04

Merging and Joining Datasets

Test data rarely arrives in one file. You have sensor readings from one system, test conditions from another, and post-processed results in a third. Combining them correctly is where most data pipelines break.

Merge: the pandas equivalent of SQL JOIN
merge.py
import pandas as pd

# Two DataFrames sharing a key column
# sensor_df: specimen_id, time, strain, load
# meta_df:   specimen_id, material, thickness, heat_treatment

# Inner join — only rows with matching IDs in both
combined = pd.merge(sensor_df, meta_df, on="specimen_id", how="inner")

# Left join — keep all sensor rows, attach meta where available
combined = pd.merge(sensor_df, meta_df, on="specimen_id", how="left")

# Merge on multiple keys
combined = pd.merge(df1, df2,
                     on=["flight_id", "leg_number"],
                     how="inner")

# Different column names in each DataFrame
combined = pd.merge(df1, df2,
                     left_on="test_ref", right_on="specimen_id",
                     how="left")

# Diagnose a failed merge immediately
print(f"df1 rows: {len(df1)},  df2 rows: {len(df2)},  merged: {len(combined)}")
# If merged << df1, keys don't match: inspect the key column on each side
print(df1["test_ref"].unique()[:5])
print(df2["specimen_id"].unique()[:5])
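Row counts tell you that keys failed to match, but not which ones. One way to find them is an outer merge with indicator=True, which adds a _merge column saying where each row came from. A small sketch with made-up specimen data; validate="many_to_one" additionally raises an error if a specimen_id appears twice in the metadata table.

```python
import pandas as pd

# Made-up data: B7 has no metadata, C3 has no sensor rows
sensor_df = pd.DataFrame({"specimen_id": ["A1", "A1", "A2", "B7"],
                          "strain": [0.001, 0.002, 0.001, 0.003]})
meta_df = pd.DataFrame({"specimen_id": ["A1", "A2", "C3"],
                        "material": ["Al 2024-T3", "Al 7075-T6", "Ti-6Al-4V"]})

# Outer join keeps unmatched rows from both sides; _merge labels each row
checked = pd.merge(sensor_df, meta_df, on="specimen_id", how="outer",
                   indicator=True, validate="many_to_one")
print(checked["_merge"].value_counts())

# Rows that exist on only one side are the keys to investigate
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[["specimen_id", "_merge"]])
```

Once the unmatched keys are fixed (typos, stray whitespace, different ID formats), switch back to the inner or left join you actually want.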
Joining on timestamps: the hardest case
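Two sensors record at different sample rates, and you need to align them by time, not by row number. merge_asof handles this: for each row of the left DataFrame it finds the most recent timestamp at or before it in the right DataFrame by default, or the nearest one in either direction with direction="nearest".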
time_join.py: align sensors at different sample rates
import pandas as pd

# gps_df:  1 Hz — timestamp, lat, lon, altitude
# imu_df: 100 Hz — timestamp, accel_x, accel_y, accel_z

# Both must be sorted by timestamp first
gps_df = gps_df.sort_values("timestamp")
imu_df = imu_df.sort_values("timestamp")

# Merge: for each IMU row (left), attach the nearest GPS fix (right)
combined = pd.merge_asof(
    imu_df,
    gps_df,
    on="timestamp",
    direction="nearest",     # or "backward", "forward"
    tolerance=pd.Timedelta("500ms")  # don't match if gap > 500ms
)

# Alternative: resample both to a common 10 Hz grid first, then join on the index
gps_10hz = gps_df.set_index("timestamp").resample("100ms").interpolate()
imu_10hz = imu_df.set_index("timestamp").resample("100ms").mean()
combined = gps_10hz.join(imu_10hz, how="inner")
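The effect of direction and tolerance is easiest to see on three rows. A minimal sketch with invented timestamps and values; only the column names follow the example above.

```python
import pandas as pd

base = pd.Timestamp("2024-01-01 12:00:00")

# Made-up GPS fixes at 0 s, 1 s, and 4 s (a 3 s dropout in between)
gps_df = pd.DataFrame({
    "timestamp": base + pd.to_timedelta([0, 1000, 4000], unit="ms"),
    "altitude":  [1000.0, 1010.0, 1040.0],
})
# Made-up IMU samples at 0.2 s, 0.9 s, and 2.6 s
imu_df = pd.DataFrame({
    "timestamp": base + pd.to_timedelta([200, 900, 2600], unit="ms"),
    "accel_z":   [9.8, 9.9, 10.1],
})

# Nearest match in either direction, but only within 500 ms
nearest = pd.merge_asof(imu_df, gps_df, on="timestamp",
                        direction="nearest",
                        tolerance=pd.Timedelta("500ms"))

# Default behaviour: last GPS fix at or before each IMU sample, no tolerance
backward = pd.merge_asof(imu_df, gps_df, on="timestamp")

print(nearest)
print(backward)
```

The 2.6 s sample gets NaN altitude in the first result because no GPS fix lies within 500 ms, while the default backward match reuses the 1 s fix, which is by then 1.6 s old. Setting a tolerance is what stops stale values from being attached silently.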
Concatenating multiple test runs
concat_runs.py: batch process a folder of test files
import pandas as pd
import glob, os

# Load every CSV in the folder and stack them
files = sorted(glob.glob("test_runs/*.csv"))   # sorted: glob order is not guaranteed

dfs = []
for path in files:
    df     = pd.read_csv(path)
    df["source_file"] = os.path.basename(path)  # track origin
    dfs.append(df)

# Concat — ignore_index resets row numbers, keys labels each source
all_data = pd.concat(dfs, ignore_index=True)
print(f"Total rows: {len(all_data):,} from {len(files)} files")

# Concat with hierarchical index (keep file identity)
all_keyed = pd.concat(dfs, keys=[os.path.basename(f) for f in files])
# Access one file's data: all_keyed.loc["run_003.csv"]

# Verify: check for column mismatches across files
col_sets = [set(df.columns) for df in dfs]
if len(set.union(*col_sets)) != len(set.intersection(*col_sets)):
    print("WARNING: column mismatch across files")
    for s in col_sets:
        print(s)
Data Processing · DP05

Automating Reports

The last mile of any analysis is communication. Generating formatted reports programmatically (PDF, Word, or HTML) means the report updates automatically when the data changes. No manual copy-paste, no formatting drift.

HTML reports: fast, portable, no dependencies
An HTML file with embedded plots and tables opens in any browser, can be emailed, and needs no special software to view. Fastest option for internal reporting.
html_report.py: generate a self-contained analysis report
import pandas as pd
import matplotlib.pyplot as plt
import base64
from io import BytesIO
from datetime import datetime

def fig_to_b64(fig):
    """Convert matplotlib figure to base64 string for embedding in HTML."""
    buf = BytesIO()
    fig.savefig(buf, format="png", dpi=150, bbox_inches="tight")
    buf.seek(0)
    return base64.b64encode(buf.read()).decode()

# Load and process data
df      = pd.read_csv("test_results.csv")
summary = df.describe().round(3)

# Create a plot
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df["time"], df["stress_MPa"], lw=1.5)
ax.set_xlabel("Time (s)"); ax.set_ylabel("Stress (MPa)")
ax.grid(True, alpha=0.3)
img_b64 = fig_to_b64(fig)
plt.close(fig)

# Build HTML
html = f"""<!DOCTYPE html><html><head>
<style>body{{font-family:sans-serif;max-width:960px;margin:40px auto;}}
table{{border-collapse:collapse;width:100%;}}
th,td{{border:1px solid #ddd;padding:8px;text-align:right;}}
th{{background:#f5f5f5;}}
</style></head><body>
<h1>Test Report</h1>
<p>Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
<h2>Summary Statistics</h2>
{summary.to_html()}
<h2>Stress vs Time</h2>
<img src="data:image/png;base64,{img_b64}" style="width:100%">
</body></html>"""

with open("report.html", "w") as f:
    f.write(html)
print("Saved report.html")
Word documents: for formal engineering reports
word_report.py: install: pip install python-docx
from datetime import datetime   # used for the report date below

from docx import Document
from docx.shared import Inches, Pt
import pandas as pd

doc = Document()

# Title and metadata
doc.add_heading("Fatigue Test Report — Al 2024-T3", level=0)
doc.add_paragraph(f"Author: Noor Keshaish\nDate: {datetime.now().strftime('%d %b %Y')}")

# Section heading
doc.add_heading("Summary Statistics", level=1)

# Table from DataFrame
df      = pd.read_csv("results.csv")
summary = df.describe().reset_index()
table   = doc.add_table(rows=1, cols=len(summary.columns))
table.style = "Table Grid"

# Header row
for i, col in enumerate(summary.columns):
    table.rows[0].cells[i].text = str(col)

# Data rows
for _, row in summary.iterrows():
    cells = table.add_row().cells
    for i, val in enumerate(row):
        cells[i].text = str(round(val, 3) if isinstance(val, float) else val)

# Embed a figure
doc.add_heading("Stress-Strain Curve", level=1)
doc.add_picture("stress_strain.png", width=Inches(5.5))

doc.save("fatigue_report.docx")
print("Saved fatigue_report.docx")
PDF reports via matplotlib
pdf_report.py: multi-page PDF, no extra dependencies
from datetime import datetime   # used for the title-page date below

import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import pandas as pd

df = pd.read_csv("test_results.csv")

with PdfPages("test_report.pdf") as pdf:

    # Page 1: title page as text
    fig = plt.figure(figsize=(11, 8.5))
    fig.text(0.5, 0.6, "Test Analysis Report", ha="center", size=28)
    fig.text(0.5, 0.5, f"Generated: {datetime.now():%d %b %Y}", ha="center", size=14)
    pdf.savefig(fig); plt.close(fig)

    # Page 2: stress-strain plot
    fig, ax = plt.subplots(figsize=(11, 8.5))
    ax.plot(df["strain"], df["stress"], lw=2)
    ax.set_xlabel("Strain (%)"); ax.set_ylabel("Stress (MPa)")
    ax.set_title("Stress-Strain Curve"); ax.grid(True, alpha=0.3)
    pdf.savefig(fig); plt.close(fig)

    # Page 3: summary table rendered as a figure
    fig, ax = plt.subplots(figsize=(11, 4))
    ax.axis("off")
    summary = df.describe().round(2)
    tbl = ax.table(cellText=summary.values,
                  rowLabels=summary.index,
                  colLabels=summary.columns,
                  loc="center", cellLoc="right")
    tbl.auto_set_font_size(False); tbl.set_fontsize(9)
    pdf.savefig(fig, bbox_inches="tight"); plt.close(fig)

print("Saved test_report.pdf (3 pages)")
Visualisation · VIZ01

Plotly: Interactive Plots

Matplotlib produces static images. Plotly produces interactive HTML: hover for values, zoom, pan, toggle traces on/off, export to PNG. One function call to go from data to an interactive browser chart.

Line and scatter plots
plotly_basics.py: install: pip install plotly
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd

df = pd.read_csv("flight_data.csv")

# Line plot — interactive, hover shows values
fig = px.line(df, x="time", y="altitude_ft",
              title="Altitude vs Time",
              labels={"altitude_ft": "Altitude (ft)", "time": "Time (s)"})
fig.show()   # opens in browser

# Multiple traces on one plot
fig = go.Figure()
fig.add_trace(go.Scatter(x=df["time"], y=df["altitude_ft"],
                         name="Altitude", yaxis="y1"))
fig.add_trace(go.Scatter(x=df["time"], y=df["airspeed_kts"],
                         name="Airspeed", yaxis="y2"))
fig.update_layout(
    yaxis =dict(title="Altitude (ft)"),
    yaxis2=dict(title="Airspeed (kts)", overlaying="y", side="right")
)
fig.show()

# Scatter with colour mapped to a third variable
fig = px.scatter(df, x="airspeed_kts", y="load_factor",
                  color="altitude_ft", size="fuel_flow",
                  hover_data=["time", "phase"],
                  title="Load Factor vs Airspeed")
fig.show()
Saving and sharing interactive charts
save_plotly.py
# Save as standalone HTML — send to anyone, no Python needed
fig.write_html("flight_analysis.html")

# Save as static image (requires kaleido: pip install kaleido)
fig.write_image("plot.png", width=1200, height=600, scale=2)
fig.write_image("plot.pdf")   # vector PDF for reports

# Subplots
from plotly.subplots import make_subplots

fig = make_subplots(rows=3, cols=1, shared_xaxes=True,
                     subplot_titles=["Altitude", "Airspeed", "Load Factor"])
fig.add_trace(go.Scatter(x=df["time"], y=df["altitude_ft"],  name="Alt"),  row=1, col=1)
fig.add_trace(go.Scatter(x=df["time"], y=df["airspeed_kts"], name="IAS"),  row=2, col=1)
fig.add_trace(go.Scatter(x=df["time"], y=df["load_factor"],  name="Nz"),   row=3, col=1)
fig.update_layout(height=700, title="Flight Data Overview")
fig.show()
Plotly vs matplotlib: when to use each:
Use matplotlib for: publication figures, embedded plots in reports, precise layout control, LaTeX labels.
Use Plotly for: exploratory analysis where you want to inspect values, dashboards, anything you'll share as HTML, multi-axis time series with large datasets.
Visualisation · VIZ02

Seaborn: Statistical Plots

Seaborn wraps matplotlib with statistical plot types that would take 30 lines to build manually: distribution plots, correlation heatmaps, pair plots, regression overlays. Built for data that has categories and distributions.

Distribution plots: understand your data's shape
distributions.py: install: pip install seaborn
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("material_tests.csv")

# Histogram + KDE (kernel density estimate)
sns.histplot(df["UTS_MPa"], kde=True, bins=30)
plt.xlabel("UTS (MPa)"); plt.show()

# Compare distributions across groups
sns.histplot(data=df, x="UTS_MPa", hue="material", kde=True)
plt.show()

# Box plot — show median, IQR, and outliers
sns.boxplot(data=df, x="material", y="UTS_MPa")
plt.show()

# Violin plot — box + full distribution shape
sns.violinplot(data=df, x="material", y="UTS_MPa", inner="box")
plt.show()

# Strip + box combined (show individual points)
fig, ax = plt.subplots(figsize=(10, 5))
sns.boxplot(data=df, x="material", y="UTS_MPa", ax=ax, fliersize=0)
sns.stripplot(data=df, x="material", y="UTS_MPa", ax=ax,
              alpha=0.4, jitter=True, color="steelblue")
plt.show()
Correlation heatmaps
heatmap.py: find relationships between all variables at once
import seaborn as sns
import matplotlib.pyplot as plt

# Correlation matrix of all numeric columns
corr = df.select_dtypes("number").corr()

fig, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(corr,
            annot=True,          # show correlation values
            fmt=".2f",
            cmap="coolwarm",     # red=positive, blue=negative
            center=0,
            square=True,
            linewidths=0.5,
            ax=ax)
ax.set_title("Parameter Correlation Matrix")
plt.tight_layout(); plt.show()

# Mask the upper triangle (remove redundancy)
import numpy as np
mask = np.triu(np.ones_like(corr, dtype=bool))
sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
            cmap="coolwarm", center=0)
plt.show()
Pair plots and regression overlays
pairplot.py
# Pair plot: every variable plotted against every other
# Diagonal shows each variable's distribution
sns.pairplot(df[["UTS_MPa", "E_GPa", "elongation", "hardness"]],
             diag_kind="kde")
plt.show()

# Pair plot coloured by category
sns.pairplot(df, hue="material",
             vars=["UTS_MPa", "E_GPa", "elongation"])
plt.show()

# Scatter with regression line + confidence interval
sns.regplot(data=df, x="E_GPa", y="UTS_MPa",
            scatter_kws={"alpha": 0.5},
            line_kws={"color": "red"})
plt.show()

# lmplot — regression per category
sns.lmplot(data=df, x="E_GPa", y="UTS_MPa", hue="material",
           height=5, aspect=1.5)
plt.show()
Visualisation · VIZ03

Dash & Streamlit — Data Apps

A dashboard turns your analysis script into something non-coders can use. Upload a CSV, change a parameter, click a button, and the plots update. Streamlit is the fastest path to a working app. Dash gives you more control for production tools.

Streamlit: a data app in 20 lines
Install once: pip install streamlit. Run with: streamlit run app.py. Opens a live browser tab that reloads every time you save the file.
streamlit_app.py: upload CSV, plot it, filter it
import streamlit as st
import pandas    as pd
import plotly.express as px

st.title("Flight Data Analyser")

# File uploader — drag and drop CSV
uploaded = st.file_uploader("Upload a CSV file", type=["csv"])

if uploaded:
    df = pd.read_csv(uploaded)
    st.write(f"Loaded {len(df):,} rows")

    # Sidebar controls
    cols     = df.select_dtypes("number").columns.tolist()
    x_col    = st.sidebar.selectbox("X axis", cols)
    y_col    = st.sidebar.selectbox("Y axis", cols, index=min(1, len(cols) - 1))  # safe with one numeric column
    plot_type= st.sidebar.radio("Plot type", ["Line", "Scatter"])

    # Filter by altitude range
    if "altitude_ft" in df.columns:
        min_alt, max_alt = int(df["altitude_ft"].min()), int(df["altitude_ft"].max())
        alt_range = st.slider("Altitude range (ft)", min_alt, max_alt, (min_alt, max_alt))
        df = df[df["altitude_ft"].between(*alt_range)]

    # Plot
    if plot_type == "Line":
        fig = px.line(df, x=x_col, y=y_col)
    else:
        fig = px.scatter(df, x=x_col, y=y_col, opacity=0.5)

    st.plotly_chart(fig, use_container_width=True)

    # Statistics
    st.subheader("Summary statistics")
    st.dataframe(df[[x_col, y_col]].describe().round(3))
Deploying Streamlit: share with anyone
deployment steps
# 1. Create requirements.txt
#    streamlit
#    pandas
#    plotly
#    numpy

# 2. Push to GitHub (public or private repo)

# 3. Go to share.streamlit.io
#    Connect GitHub → select repo → select app.py → Deploy
#    Free tier gives you a public URL: yourapp.streamlit.app

# Run locally
# streamlit run app.py
Dash: production-grade dashboards
Dash gives you full control over layout, callbacks, and multi-page apps. More setup than Streamlit but the right tool when you need precise UI control or are building something for wider use.
dash_app.py: install: pip install dash
from dash import Dash, dcc, html, Input, Output
import plotly.express as px
import pandas as pd

df  = pd.read_csv("flight_data.csv")
app = Dash(__name__)

app.layout = html.Div([
    html.H1("Flight Data Dashboard"),
    dcc.Dropdown(
        id="y-param",
        options=[{"label": c, "value": c} for c in df.select_dtypes("number").columns],
        value="altitude_ft"
    ),
    dcc.Graph(id="main-chart")
])

@app.callback(
    Output("main-chart", "figure"),
    Input("y-param", "value")
)
def update_chart(y_col):
    return px.line(df, x="time", y=y_col, title=y_col)

if __name__ == "__main__":
    app.run(debug=True)
# Visit: http://localhost:8050
Streamlit vs Dash: decision rule:
Streamlit: you need something working today, the UI can be simple, internal use.
Dash: you need precise layout control, complex interactivity, or it's going to external users.
Quick Reference

Python & C++ Cheat Sheet

Every essential concept from the course, side by side. Bookmark this page. Come back to it whenever you're coding and can't remember the exact syntax.

Concept | Python | C++ | Notes
PRINTING
Print text | print("Hello") | cout << "Hello" << endl; | endl = new line
Print variable | print(x) | cout << x << endl; |
Print + variable | print(f"Hi {name}") | cout << "Hi " << name; | f-string vs chain
VARIABLES
Integer | x = 10 | int x = 10; | Whole numbers
Decimal | x = 3.14 | double x = 3.14; | Decimal numbers
Text | x = "hello" | string x = "hello"; | Always in quotes
True/False | x = True | bool x = true; | Lowercase in C++
MATHS
Add | a + b | a + b | Same in both
Subtract | a - b | a - b | Same in both
Multiply | a * b | a * b | Same in both
Divide | a / b | a / b | C++: int / int truncates; use double for decimals
Remainder | a % b | a % b | Same for positive integers
Power | a ** b | pow(a, b) | Need <cmath> in C++
Add to self | x += 5 | x += 5; | Same in both
USER INPUT
Read text | x = input("msg") | cin >> x; | cin >> reads one word; getline(cin, x) reads a line
Read number | x = int(input("msg")) | int x; cin >> x; | Python: must convert
IF / ELSE
If | if x > 5: | if (x > 5) { | Indentation vs braces
Else if | elif x == 5: | } else if (x == 5) { | elif vs else if
Else | else: | } else { |
Equals | x == y | x == y | == not = (= assigns)
Not equals | x != y | x != y | Same in both
And | x > 0 and x < 10 | x > 0 && x < 10 | and vs &&
Or | x < 0 or x > 10 | x < 0 || x > 10 | or vs ||
LOOPS
For (range) | for i in range(5): | for(int i=0; i<5; i++){ | 0 to 4
For (range 1-5) | for i in range(1,6): | for(int i=1; i<=5; i++){ | 1 to 5
For (list) | for x in myList: | for(type x : myArray){ | For-each
While | while x < 10: | while (x < 10) { | Same logic
Increment | i += 1 | i++ or i += 1 |
FUNCTIONS
Define (no return) | def greet(): | void greet() { | void = no return
Define (with param) | def greet(name): | void greet(string name) { | C++: must type params
Return a value | return x * 2 | return x * 2; | Replace void with type
Call a function | greet("Alice") | greet("Alice"); | Same syntax
LISTS / ARRAYS
Create | x = [1, 2, 3] | int x[] = {1, 2, 3}; |
Access item | x[0] | x[0] | Index starts at 0
Change item | x[0] = 99 | x[0] = 99; | Same
Length | len(x) | Manual count, or std::size(x) in C++17 | Raw C++ arrays don't store their length
Add item | x.append(4) | Use vector<int> | C++ arrays are fixed size
MATRICES
Create 3x3 | m = [[1,2,3],[4,5,6],[7,8,9]] | int m[3][3] = {{1,2,3},{4,5,6},{7,8,9}}; |
Access cell | m[row][col] | m[row][col] | Same in both
FILE SETUP (C++ only)
For print/input | (not needed) | #include <iostream> | Always at top
For strings | (not needed) | #include <string> | Needed for string type
Namespace | (not needed) | using namespace std; | Avoids writing std::
Main function | (not needed) | int main() { ... return 0; } | Every C++ program needs this
Golden rules for beginners:
1. In Python: indentation matters. 4 spaces = belongs to this block.
2. In C++: every statement ends with a semicolon ;
3. In C++: every block of code goes inside { } braces.
4. Both languages: == means "compare", = means "assign". Don't mix them up.
5. Both languages: arrays start at index 0, not 1.
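Rules 1, 4, and 5 can be seen together in a few lines of Python. This is just an illustrative sketch with made-up values; the C++ version follows the same logic with braces and semicolons.

```python
readings = [12, 7, 19]

# Rule 5: the first item is at index 0, the last at len - 1
first = readings[0]
last = readings[len(readings) - 1]

# Rule 4: = assigns a value, == compares two values
limit = 10
is_limit = (first == limit)

# Rule 1: the indented lines belong to the block above them
over = []
for value in readings:
    if value > limit:
        over.append(value)

print(first, last, is_limit, over)   # 12 19 False [12, 19]
```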