Learn data analysis to answer real questions in four months
Four months of daily work — about an hour a day on one book, free Kaggle micro-courses, and one personal dataset — gets a beginner from staring at spreadsheets to writing pandas notebooks that answer real questions. Roughly 120 hours total. Not a data scientist. A person who can wrangle data.
4 months · ~120 hours · publish a Jupyter notebook that answers a real question with charts
1. Python for Data Analysis, 3rd edition (Wes McKinney)
Wes McKinney created pandas and then wrote the book on it. The third edition is free online, with the same material as the print version. Read chapters 1–10 and 13, and type along in Jupyter. The chapters on data wrangling (chapter 8) and group-by (chapter 10) are the heart of working data analysis: almost everything else flows from these two. The free online edition is searchable and gets corrected continuously.
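To make those two ideas concrete, here is a minimal sketch of a join followed by a group-by. The tables, column names, and values are invented for illustration and are not taken from the book.

```python
import pandas as pd

# Hypothetical toy data: a list of purchases and a category lookup table.
purchases = pd.DataFrame({
    "item": ["coffee", "bus", "coffee", "rent", "bus"],
    "amount": [3.5, 2.0, 4.0, 900.0, 2.0],
})
categories = pd.DataFrame({
    "item": ["coffee", "bus", "rent"],
    "category": ["food", "transport", "housing"],
})

# Wrangling: join the two tables on their shared key column.
merged = purchases.merge(categories, on="item", how="left")

# Group-by: total amount per category.
totals = merged.groupby("category")["amount"].sum()
print(totals)
```

Most everyday analysis questions reduce to some variation of this shape: combine tables, then split, aggregate, and compare.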
Free online; $50 paperback from O'Reilly
Python for Data Analysis →
2. Kaggle Learn: Pandas, Data Visualization, Intro to ML
Kaggle Learn's free micro-courses fill the gap between reading and practice: short, applied, browser-based notebooks that drill the patterns McKinney introduces. Do Pandas (4 hours), Data Visualization with seaborn (4 hours), and Intro to Machine Learning (3 hours). The exercises run in Kaggle's sandbox, so there is no setup friction. Skip the SQL course and go to faculté/sql for that path.
Free
Kaggle Learn →
3. Analyze a dataset from your own life
Export your bank statements as CSV. Or your Strava history. Or your inbox metadata. Or your local government's open data portal. Pick a real question — "how does my spending change in winter?", "is my pace getting better on hilly routes?" — and answer it in a Jupyter notebook with at least three charts. Publish it as a public GitHub repo with the notebook and a one-paragraph README. This is the artefact that proves you can do data analysis.
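As a rough sketch of what the first cells of such a notebook might look like, the example below tackles the winter-spending question. The inline CSV is a made-up stand-in for a bank export; real exports will have different column names and sign conventions, and treating December to February as winter is an assumption that suits the northern hemisphere.

```python
import io
import pandas as pd

# Hypothetical stand-in for a bank export; replace with pd.read_csv("your_file.csv").
csv_text = """date,description,amount
2023-01-05,groceries,-80.00
2023-01-20,heating,-120.00
2023-04-11,groceries,-70.00
2023-07-02,groceries,-60.00
2023-07-15,holiday,-200.00
2023-12-09,gifts,-150.00
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Spending is recorded as negative amounts here; flip the sign so totals read as money spent.
df["spent"] = -df["amount"]

# Assumption: December, January, and February count as winter.
df["season"] = df["date"].dt.month.map(
    lambda m: "winter" if m in (12, 1, 2) else "rest of year"
)

# Total spend per calendar month, then the average monthly spend within each season.
df["month"] = df["date"].dt.to_period("M")
monthly = df.groupby(["season", "month"])["spent"].sum()
avg_by_season = monthly.groupby(level="season").mean()
print(avg_by_season)
```

From here, something like `avg_by_season.plot(kind="bar")` (which needs matplotlib installed) would give the first of the three charts.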
Free
Install Jupyter →
If this doesn't fit you
If you work in a business context and your colleagues use Excel and Tableau rather than Python, this path is the wrong tool. Learn advanced Excel (pivot tables, INDEX/MATCH, Power Query) and add Tableau Public. You will be more useful at work, and you can come back to Python when you outgrow them. Don't commit to a tool stack you can't share with your team.
Why this path
Most data analysis courses teach pandas in the abstract and never show how an analyst actually thinks about a messy CSV. McKinney's book is written by the person who solved that problem and keeps solving it; Kaggle's micro-courses give you reps in a low-friction sandbox; the personal dataset closes the gap. Skipping the personal project is the most common mistake. Without one, you can follow tutorials forever and never write a notebook from a blank page.