Don’t Trust `pip install`: My Defensive Python Workflow


I still remember the cold sweat I felt last Tuesday. I was rushing to meet a deadline, tired and over-caffeinated, when I almost typed pip install requesrs into my terminal. My finger stopped millimeters above the Enter key. It wasn’t divine intervention; it was just a habit I’ve forced upon myself after dealing with a nasty supply chain incident back in 2023. That single typo could have pulled down a malicious package, executed a post-install script, and scraped my SSH keys before I even realized I’d misspelled “requests”.

The Python ecosystem is incredible, but let’s be honest: PyPI is a wild place. We treat it like a curated app store, but it’s really more like a massive flea market where anyone can set up a stall. Most vendors are honest folks selling great tools, but a few are selling poison in pretty bottles. With the sheer volume of packages uploaded daily, relying on the repository maintainers to catch everything is a strategy for disaster.

I’ve spent the last year hardening my development workflow. I’m not talking about enterprise-grade air-gapped systems—just practical, everyday steps I take to ensure I don’t compromise my machine or my company’s infrastructure. Here is exactly how I handle PyPI safety in 2025, from the tools I use to the habits I can’t break.

The First Line of Defense: Modern Package Managers

For a long time, I stuck with standard pip and requirements.txt files. It felt simple. But simple doesn’t mean safe. The biggest shift in my workflow over the last eighteen months has been moving away from direct pip usage for project management. I’ve switched almost exclusively to Uv.

Uv isn’t just about speed (though it is absurdly fast); it’s about control. When I add a dependency, I want to see exactly what it brings with it. The resolution logic in older tools sometimes felt like a black box. With Uv, I feel like I have a better grip on the dependency tree.

Here is how I initialize a safe environment now. I don’t just install things globally or into a loose venv. I lock everything down immediately.

# Initialize a new project with uv
uv init my-safe-project
cd my-safe-project

# Add a dependency, but don't just install it blindly
# I check the dry-run output first if I'm unsure about transitive deps
uv add requests --dry-run

# Actually add it
uv add requests

If I’m working on a project that requires a more holistic approach to Python version management and toolchains, I reach for Rye. Rye wraps Uv under the hood but also handles the Python installation itself. This isolates my project from whatever system Python mess exists on my machine (and prevents issues where system updates break my dev environment).
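The Rye flow is just as terse. A minimal sketch (the pinned version is whatever you want to standardize on):

# Rye downloads and pins a standalone CPython, so the system interpreter never leaks in
rye init my-safe-project
cd my-safe-project
rye pin 3.12   # writes .python-version and fetches that interpreter
rye sync       # builds the isolated .venv from the lockfile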

Why does this matter for safety? Because isolation is key. If a package tries to mess with site-packages, I want that damage contained to a throwaway environment managed by Rye or Uv, not my user-level Python install.

Vetting Before Installing

Before I even run that install command, I do a background check. It sounds tedious, but it takes thirty seconds and saves infinite headaches. I don’t blindly trust a README file on GitHub. I look at the PyPI page specifically.

I look for three things:

  1. Release Velocity: Is this version 0.0.1 released an hour ago? If so, I stay away.
  2. Download Counts: While not a perfect metric, a package with 5 downloads in the last month is suspicious if it claims to be a popular utility.
  3. Maintainer History: I check if the maintainer has other reputable packages.

I also use a script that wraps pypi-info to dump metadata to my terminal so I don’t have to leave the command line. If I see a suspicious homepage URL or an empty description, I investigate further.
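If you don’t have a wrapper handy, the PyPI JSON API exposes the same metadata. Here is a minimal sketch of that kind of background check; the red-flag threshold is illustrative, not gospel:

import json
import sys
from datetime import datetime, timezone
from urllib.request import urlopen

def pypi_background_check(package):
    # Fetch the package metadata from PyPI's public JSON API
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        data = json.load(resp)
    info = data["info"]
    print(f"Package:  {info['name']} {info['version']}")
    print(f"Summary:  {info['summary'] or 'MISSING - investigate'}")
    print(f"Homepage: {info['home_page'] or 'MISSING - investigate'}")
    # Check #1 from the list above: how fresh is this thing?
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data["releases"].values()
        for f in files
    ]
    if uploads:
        age = datetime.now(timezone.utc) - min(uploads)
        print(f"First upload: {age.days} days ago")
        if age.days < 30:
            print("WARNING: very young package - stay away")

pypi_background_check(sys.argv[1] if len(sys.argv) > 1 else "requests")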


For critical dependencies, I actually download the wheel without installing it and inspect the contents. You’d be surprised how often “helper” packages contain obfuscated code. I use a simple malware-analysis approach: unzip the wheel and look for __init__.py files with base64-encoded strings or weird network calls.

import zipfile
import os

# Tokens that often accompany obfuscated or malicious code
SUSPICIOUS_TOKENS = (b"base64", b"exec(", b"eval(", b"__import__")

def inspect_wheel(wheel_path):
    with zipfile.ZipFile(wheel_path, 'r') as zip_ref:
        print(f"Files in {os.path.basename(wheel_path)}:")
        for name in zip_ref.namelist():
            print(f" - {name}")
            # Quick check for suspicious extensions
            if name.endswith(('.exe', '.sh', '.bat')):
                print(f"WARNING: Executable found: {name}")
            # Grep Python sources for the patterns mentioned above
            if name.endswith('.py'):
                content = zip_ref.read(name)
                for token in SUSPICIOUS_TOKENS:
                    if token in content:
                        print(f"WARNING: {token.decode()} found in {name}")

# Usage: download the wheel first with pip download --no-deps <package>
# inspect_wheel("suspicious_package-1.0.0-py3-none-any.whl")

I’ve caught a few “typosquatters” this way—packages that look empty but have a massive setup.py that tries to exfiltrate env vars.
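You can do the same triage on an sdist without ever executing setup.py. A sketch (the package name is a stand-in, and the --wildcards flag assumes GNU tar; bsdtar globs by default):

# Pull the source distribution without executing anything
pip download --no-deps --no-binary :all: some-package -d /tmp/audit

# List the contents, then dump setup.py and grep for exfiltration patterns
tar -tzf /tmp/audit/some-package-*.tar.gz
tar -xOzf /tmp/audit/some-package-*.tar.gz --wildcards '*/setup.py' \
    | grep -nE 'environ|base64|urlopen|socket'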

Locking Dependencies with Hashes

If you aren’t pinning your dependencies with hashes, you are playing Russian roulette. A version number isn’t enough. If an attacker compromises a maintainer’s account, they can replace version 1.2.3 with a malicious artifact. If you only pinned ==1.2.3, you pull the malware. If you pinned the hash, your install fails loudly.

Most modern tools do this by default now, which is a blessing. PDM and Uv both generate lockfiles with hashes automatically. I verify this constantly. I never deploy to production without a uv.lock or pdm.lock file that contains the sha256 sums of every single package.

Here is a snippet of what I look for in my lockfiles. If I see a package missing a hash, I reject the PR.

# Example snippet from a lockfile
[[package]]
name = "fastapi"
version = "0.115.0"
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
optional = false
python-versions = ">=3.8"
files = [
    { file = "fastapi-0.115.0-py3-none-any.whl", hash = "sha256:..." },
    { file = "fastapi-0.115.0.tar.gz", hash = "sha256:..." },
]

This ensures that the bytes I tested on my machine are the exact same bytes that land on the server. It mitigates the risk of a “silent swap” attack on PyPI.
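And if a project is stuck on plain pip, hash-checking mode gives the same guarantee. A sketch, assuming your top-level dependencies live in a requirements.in:

# Compile a fully hashed requirements file
uv pip compile --generate-hashes requirements.in -o requirements.txt

# pip refuses to install anything whose hash doesn't match the file
pip install --require-hashes -r requirements.txt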

Static Analysis as a Security Tool

We usually think of linters as style enforcers, but I use them for security. The Ruff linter has become indispensable in my workflow. It’s fast enough to run on every save, and it catches things that look “smelly.” While it won’t explicitly tell me “this is malware,” it flags unused imports, undefined variables, and ambiguous execution paths that often accompany malicious injections or just bad code that leads to vulnerabilities.

For example, if I audit a third-party library and see it using eval() or exec(), Ruff screams about it. I also enforce strict type hints with MyPy. Why? Because dynamic typing is where a lot of injection attacks hide. If a function expects a string but accepts an object that executes code on __str__, that’s a vector. Strong typing makes data flow explicit.
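My MyPy baseline is short; strict mode does most of the heavy lifting (tune to taste):

# pyproject.toml configuration for MyPy
[tool.mypy]
strict = true
warn_unreachable = true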

I configure Ruff to be aggressive. I want to know about every shadow import and every blind except: block.

# pyproject.toml configuration for Ruff
# (select/ignore live under [tool.ruff.lint] in current Ruff versions)
[tool.ruff.lint]
select = ["E", "F", "B", "S"] # S is for flake8-bandit (security)
ignore = []

[tool.ruff.lint.per-file-ignores]
"tests/*" = ["S101"] # Allow assert in tests

Notice the "S" selector? That enables flake8-bandit rules within Ruff. It specifically looks for common security issues like hardcoded passwords, weak cryptography, and shell injection risks. It’s zero-effort security scanning.
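For a feel of what the S rules catch, here are two lines Ruff flags immediately (illustrative, with the rule codes in the comments):

import subprocess

user_input = input("path: ")
password = "hunter2"  # S105: hardcoded password string
subprocess.run(f"ls {user_input}", shell=True)  # S602: shell=True enables injection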

The Threat of Core Vulnerabilities

We often focus on malicious packages, but sometimes the threat comes from the language runtime itself or the interaction between Python and the underlying hardware. I’ve seen edge cases where memory safety issues in C extensions—or even CPython interpreters on specific architectures—cause data corruption. While rare, these “internal errors” remind me that the entire stack needs scrutiny.


If a core string formatting operation fails or a buffer overflow occurs in a C-module, it doesn’t matter how safe my Python code is. This is why I prefer pure Python libraries over C-extensions when performance isn’t the absolute bottleneck. It reduces the surface area for memory safety bugs. When I do use heavy hitters like Polars or NumPy, I pin them to versions I’ve stress-tested. I don’t upgrade these core numeric libraries the day they release; I wait a week to see if the community reports any segfaults or regressions.
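In practice, that pinning is just exact versions in pyproject.toml (the versions below are placeholders, not recommendations):

[project]
dependencies = [
    "numpy==2.1.1",   # placeholder: the last version I stress-tested
    "polars==1.9.0",  # placeholder: the last version I stress-tested
]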

Leveraging AI for Defense

I know, AI is everywhere, but hear me out. I use a local LLM (usually a quantized Llama model running via Ollama) to review code I don’t understand. If I find a dependency that has a complex, obfuscated function, I paste it into my local model and ask, "What does this code actually do? Are there security risks?"

I never paste this code into a cloud-based model like ChatGPT because I don’t want to leak potentially proprietary (or malicious) code to a third party. Running it locally gives me a second pair of eyes. It’s surprisingly good at de-obfuscating variable names or explaining why a regex is vulnerable to ReDoS (Regular Expression Denial of Service).
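When I want this in a script rather than a chat window, the ollama Python client works fine. A minimal sketch, assuming you have already pulled a model (the model name and file path are stand-ins):

import ollama

def review_snippet(code):
    # Ask the local model for a plain-language explanation plus risk flags
    response = ollama.chat(
        model="llama3.1",  # stand-in: use whatever model you've pulled locally
        messages=[{
            "role": "user",
            "content": "What does this Python code actually do? "
                       "Flag any security risks:\n\n" + code,
        }],
    )
    return response["message"]["content"]

with open("vendored/suspicious_helper.py") as f:  # stand-in path
    print(review_snippet(f.read()))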

Sandboxing: The Nuclear Option

Sometimes, I just have to run code I don’t fully trust. Maybe it’s a new FastAPI scraper or a LangChain demo that pulls in a dozen obscure dependencies. I never run this on my main OS.

I use Dev Containers for everything now. If a project gets compromised, the attacker gets root inside a Docker container that has no access to my host filesystem, no SSH keys, and no AWS credentials. It’s ephemeral. I delete the container, and the malware is gone.
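The setup cost is one file. A minimal sketch of .devcontainer/devcontainer.json (the image tag is an assumption; pick your own base):

{
    "name": "untrusted-deps-sandbox",
    "image": "mcr.microsoft.com/devcontainers/python:3.12",
    // No extra host mounts: the container sees the workspace and nothing else
    "mounts": []
}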


For the truly paranoid moments—like analyzing a potential malware sample—I spin up a disposable VM. It’s heavy, but it’s the only way to be sure. I’ve seen malware that detects if it’s in a Docker container and behaves differently, but VMs are harder to trick.

Continuous Monitoring

Safety isn’t a one-time setup. I run pip-audit in my CI pipeline. Every time I push code, it scans my environment for packages with known vulnerabilities (CVEs). If a vulnerability is discovered in a library I used six months ago, I want to know about it today.

# GitHub Actions snippet
- name: Audit dependencies
  run: |
    pip install pip-audit
    pip-audit

This breaks the build if I’m using an insecure version of Django or requests. It forces me to upgrade. It’s annoying, yes. But it’s less annoying than explaining a data breach to a client.

My Recommendation

If you take one thing away from my paranoia, let it be this: stop treating pip install as a harmless command. Treat it like downloading an executable from the internet, because that is exactly what it is. Switch to a lockfile-based workflow using Uv or PDM. Turn on hash checking. And for the love of code, don’t run unvetted dependencies on your production machines with root access.

The Python ecosystem is robust, but PyPI safety is ultimately your responsibility. The tools are there; we just have to be disciplined enough to use them every single time.
