The ML/data package map
When an AI writes data or ML code, it reaches for a handful of packages, and the imports tell you what kind of work is happening. You don't need to use them to read them — you need the map of what each is for.
- numpy — fast numeric arrays and math. The base layer most others build on.
- pandas — tables (dataframes): load a CSV, filter rows, group and aggregate. The spreadsheet of Python.
- scikit-learn (imported as sklearn) — classic ML on tabular data: train a decision tree, a logistic regression, a clustering model. Not deep learning.
- torch (PyTorch) — deep learning: tensors, neural networks, training loops. Heavyweight; worth it when you need to build or train a custom network.
- transformers (Hugging Face) — load and run a pretrained LLM or NLP model without training one yourself.
Run the editor for the map.
Reading the import line
import pandas as pd at the top of an AI-written script tells you "this is
table wrangling." from sklearn... import says "classic model on tabular
data." import torch says "neural nets — this is heavier." Recognizing the
package is recognizing the kind of program before you read a line of
logic.
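To make the idea concrete, here is a minimal sketch that reads the import lines of a script and reports what kind of work they signal. It uses only the standard-library ast module, so nothing from the map has to be installed. The names PACKAGE_MAP and kinds_of_work are hypothetical, made up for this illustration, and the descriptions are shortened from the list above.

```python
import ast

# Hypothetical lookup table: top-level package name -> kind of work it signals.
PACKAGE_MAP = {
    "numpy": "numeric arrays and math",
    "pandas": "table wrangling",
    "sklearn": "classic ML on tabular data",
    "torch": "deep learning",
    "transformers": "pretrained LLM/NLP models",
}


def kinds_of_work(source: str) -> dict:
    """Return {package: kind of work} for recognized imports in `source`."""
    found = {}
    # ast.parse only reads the text; it never imports or runs the script.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # "sklearn.tree" -> "sklearn"
            if top in PACKAGE_MAP:
                found[top] = PACKAGE_MAP[top]
    return found


# Hypothetical AI-written script header.
script = """
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
"""

print(kinds_of_work(script))
```

For this sample header the function reports pandas and sklearn, which reads as "load a table, then fit a classic model on it."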
Why a builder cares
Picking the wrong package is a classic AI mistake — reaching for torch to
filter a CSV (overkill; pandas does it in a line), or hand-rolling math
that numpy does faster. Knowing the map lets you sanity-check "is this the
right tool for the task?" You'll match tasks to packages next — no installs,
just the mapping.
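For a sense of scale, the CSV filter mentioned above might look like the sketch below. It assumes pandas is installed; the CSV content, the column names, and the age threshold are invented for illustration, and the data is inlined so no file on disk is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content, inlined so the sketch needs no file on disk.
csv_text = """name,city,age
Ana,Lisbon,34
Ben,Oslo,19
Chloe,Lisbon,27
"""

df = pd.read_csv(io.StringIO(csv_text))

# The whole "filter a CSV" task: keep rows where age is at least 21.
adults = df[df["age"] >= 21]

adult_names = adults["name"].tolist()
print(adult_names)
```

The filter itself is the single line that builds adults. If a script pulls in torch for a task of this shape, that is a signal to question the tool choice.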