CVE Monitoring Script: NIST Data Filtered by Python

The sheer volume of daily CVEs from NIST overwhelms many organizations. Fortunately, a lean 50-line Python script demonstrates how to cut through the noise and deliver precisely the alerts that matter.

Diagram showing data flow from NIST CVE feeds through a Python script to Telegram alerts.

Key Takeaways

  • A 50-line Python script automates CVE monitoring by filtering NIST data.
  • RSS feeds are preferred over the NVD API for daily digests due to simplicity and no authentication requirements.
  • Deduplication using a persistent seen file is critical to prevent alert fatigue.
  • Relying solely on CVSS scores for filtering is a trap; EPSS and keyword matching provide a more nuanced approach.
  • The script delivers relevant alerts via Telegram, enabling targeted security awareness.

For years, security teams have struggled with the volume of Common Vulnerabilities and Exposures (CVEs) published daily by NIST. The stream is comprehensive, but for most teams it is unmanageable as raw data. What they need is a short list of entries relevant to their own stack, and that is what a remarkably concise Python script, clocking in at just 50 lines, aims to deliver.

Forget sifting through hundreds of daily entries from the National Vulnerability Database (NVD). This isn’t about raw access; it’s about selective ingestion. The problem isn’t a lack of CVE data but a poor signal-to-noise ratio, which makes missing a critical alert a genuine possibility. The typical security team is swamped, and trying to read everything leaves it exposed to the very threats it is supposed to be monitoring.

So, how is this problem being tackled? There are two programmatic routes for consuming CVE data: the NVD REST API and RSS feeds. The NVD API offers depth, providing structured records with CVSS scores and CPE references. However, it paginates results and enforces rate limits, both of which can quickly bog down a naive implementation. A free API key raises the rate limit, but the pagination and retry logic still have to be written.
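To make the pagination point concrete, here is a minimal sketch of walking NVD API results page by page. It is not part of the original script. The `startIndex`, `resultsPerPage`, `totalResults`, and `vulnerabilities` names follow the NVD CVE API 2.0 as I understand it; the function names, the page size, and the six-second pause are assumptions chosen for illustration.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Iterator

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def fetch_nvd_page(start_index: int, page_size: int, **filters: str) -> dict:
    """Fetch one page of CVE records from the NVD API (network call)."""
    params = {"startIndex": start_index, "resultsPerPage": page_size, **filters}
    url = f"{NVD_URL}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_cve_ids(
    fetch_page: Callable[[int, int], dict] = fetch_nvd_page,
    page_size: int = 2000,
    pause_seconds: float = 6.0,
) -> Iterator[str]:
    """Yield CVE IDs page by page until totalResults is exhausted."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        for item in page.get("vulnerabilities", []):
            yield item["cve"]["id"]
        start += page_size
        if start >= page.get("totalResults", 0):
            break
        time.sleep(pause_seconds)  # assumed pause to stay under the rate limit
```

Passing the page fetcher as a parameter keeps the loop testable without network access. Even this small loop shows why RSS is attractive for a daily digest: none of this bookkeeping is needed there.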

RSS feeds, on the other hand, offer a simpler path for day-to-day monitoring. Sources like NVD, CERT, and CVE.org provide feeds that update regularly, are easily parsed with libraries like feedparser, and crucially, require no authentication. The compromise? Less structured data, often missing precise CVSS scores but delivering the essential title, description, and publication date.

For the specific use case of a daily digest delivered via Telegram, with targeted keyword filtering, RSS with keyword matching emerges as the pragmatic choice. The NVD API, in this setup, is reserved for deeper dives into specific CVEs when a score lookup is truly necessary.
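For the occasional score lookup, a single-CVE query is enough. The sketch below is an illustration rather than the author's code: it assumes the NVD CVE API 2.0 `cveId` parameter and a response layout in which each record carries `metrics` lists such as `cvssMetricV31`, each with a `cvssData.baseScore` field. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"
# Metric keys in order of preference (newest CVSS version first); assumed names.
METRIC_KEYS = ("cvssMetricV40", "cvssMetricV31", "cvssMetricV30", "cvssMetricV2")


def extract_base_score(record: dict) -> Optional[float]:
    """Return the first CVSS base score found in an NVD API response."""
    for vuln in record.get("vulnerabilities", []):
        metrics = vuln.get("cve", {}).get("metrics", {})
        for key in METRIC_KEYS:
            for metric in metrics.get(key, []):
                score = metric.get("cvssData", {}).get("baseScore")
                if score is not None:
                    return float(score)
    return None  # not yet scored, or an unexpected response shape


def lookup_score(cve_id: str) -> Optional[float]:
    """Query NVD for one CVE and return its base score, if any (network call)."""
    query = urllib.parse.urlencode({"cveId": cve_id})
    with urllib.request.urlopen(f"{NVD_URL}?{query}", timeout=30) as resp:
        return extract_base_score(json.load(resp))
```

Returning `None` instead of raising matters here, because newly published CVEs frequently have no score yet.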

The Ingenious Script

The core logic is elegantly simple, utilizing a cron job for regular execution. Here’s the script in its entirety:

import feedparser
import requests
import hashlib
import json
from pathlib import Path
from datetime import datetime, timezone, timedelta

TELEGRAM_BOT_TOKEN = "YOUR_BOT_TOKEN"
TELEGRAM_CHAT_ID = "YOUR_CHAT_ID"
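# Note: /tmp is often cleared on reboot, which resets deduplication;
# point this at a persistent path if that matters in your environment.
SEEN_FILE = Path("/tmp/.cve-seen.json")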

KEYWORDS = [
    "fortinet", "fortigate", "fortios",
    "sonicwall", "palo alto", "panos",
    "pfSense", "opnsense",
    "windows server", "active directory", "kerberos",
    "vmware esxi", "vcenter",
    "cisco ios", "cisco asa",
]

CVE_FEEDS = [
    "https://feeds.feedburner.com/nvd-cve/rss",
    "https://www.cert.ssi.gouv.fr/alerte/feed/", # ANSSI alerts
]

def load_seen() -> set:
    if SEEN_FILE.exists():
        return set(json.loads(SEEN_FILE.read_text()))
    return set()

def save_seen(seen: set) -> None:
    SEEN_FILE.write_text(json.dumps(list(seen)))

def is_recent(entry) -> bool:
    published = entry.get("published_parsed")
    if not published:
        return True # include if date unknown
    pub_dt = datetime(*published[:6], tzinfo=timezone.utc)
    return datetime.now(timezone.utc) - pub_dt < timedelta(hours=26)

def matches_keywords(entry) -> bool:
    text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
    return any(kw.lower() in text for kw in KEYWORDS)

def entry_id(entry) -> str:
    return hashlib.md5(entry.get("link", entry.get("title", "")).encode()).hexdigest()

def send_telegram(message: str) -> None:
    url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
    # Sent as plain text: feed titles and summaries often contain characters
    # such as "_" or "*" that Telegram's Markdown parser rejects when unbalanced.
    resp = requests.post(url, json={
        "chat_id": TELEGRAM_CHAT_ID,
        "text": message,
        "disable_web_page_preview": True,
    }, timeout=10)
    resp.raise_for_status()  # surface API errors instead of failing silently

def main():
    seen = load_seen()
    alerts = []
    for feed_url in CVE_FEEDS:
        feed = feedparser.parse(feed_url)
        for entry in feed.entries:
            eid = entry_id(entry)
            if eid in seen:
                continue
            if not is_recent(entry):
                continue
            if not matches_keywords(entry):
                continue
            seen.add(eid)
            title = entry.get("title", "No title")
            link = entry.get("link", "")
            summary = entry.get("summary", "")[:200]
            alerts.append(f"{title}\n{summary}...\n{link}")
    if alerts:
        # Use UTC for the date, consistent with is_recent().
        today = datetime.now(timezone.utc).strftime('%Y-%m-%d')
        header = f"CVE Alert - {today} ({len(alerts)} new)\n\n"
        send_telegram(header + "\n\n---\n\n".join(alerts[:10])) # cap at 10
    # No alerts: stay silent (no news is good news).
    # Persist only after a successful send so a failed delivery is retried.
    save_seen(seen)

if __name__ == "__main__":
    main()

This script is scheduled to run every six hours via cron:

0 */6 * * * /usr/bin/python3 /opt/scripts/cve-monitor.py
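One limit that the script's cap of 10 alerts only partly addresses: Telegram's sendMessage accepts at most 4096 characters of text, so a batch of long summaries and links can still be rejected. The helper below is a hypothetical addition, not part of the original script, that splits a list of alert strings into messages that fit.

```python
TELEGRAM_MAX_CHARS = 4096  # documented limit for sendMessage text


def chunk_alerts(alerts, separator="\n\n---\n\n", limit=TELEGRAM_MAX_CHARS):
    """Group alert strings into messages no longer than `limit` characters."""
    messages = []
    current = ""
    for alert in alerts:
        alert = alert[:limit]  # hard-truncate a single oversized alert
        candidate = alert if not current else current + separator + alert
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this alert would overflow: flush and start a new message.
            messages.append(current)
            current = alert
    if current:
        messages.append(current)
    return messages
```

Each returned string could then be passed to `send_telegram` in turn, with the header prepended to the first alert.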

Why Deduplication is King

The most insidious trap in CVE monitoring is duplicate alerts. When the same vulnerability keeps reappearing, run after run, the result is notification fatigue: users start ignoring the channel, and the whole system becomes useless. The SEEN_FILE JSON addresses this by persisting a hash of each alerted entry's link across script executions. One caveat follows from that choice of key: the same CVE published under different URLs by two feeds is still treated as two entries. Deduplication is not just a feature here; it is the bedrock of a functional system.
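Because the script keys on the entry link, one way to catch the same CVE arriving from two different feeds is to key on the CVE identifier itself. This is a minimal sketch of that alternative, not the author's method; `dedup_keys` and `is_new` are hypothetical names, and the fallback to a link hash covers advisories that mention no CVE ID.

```python
import hashlib
import re

# CVE IDs have the form CVE-YYYY-NNNN, with four or more digits in the last part.
CVE_ID_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def dedup_keys(title: str, summary: str = "", link: str = "") -> set:
    """Return CVE IDs mentioned in an entry, or a link hash as a fallback."""
    ids = {m.upper() for m in CVE_ID_RE.findall(f"{title} {summary} {link}")}
    if ids:
        return ids
    return {hashlib.sha256((link or title).encode()).hexdigest()}


def is_new(keys: set, seen: set) -> bool:
    """An entry is new if at least one of its keys has not been seen."""
    return not keys <= seen
```

In the main loop this would replace the `eid in seen` check, with `seen |= keys` after an entry is queued. An advisory that bundles several CVEs still alerts once, as long as at least one of them is new.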

The CVSS Score Fallacy

Filtering solely by CVSS score, especially with a high threshold such as 7.0, is a dangerous oversimplification. The author learned this the hard way, missing an actively exploited FortiOS vulnerability with a CVSS score of 5.9. The case highlights the crucial distinction between theoretical severity (CVSS) and the estimated probability of real-world exploitation (EPSS, the Exploit Prediction Scoring System). Relying on either metric in isolation is a losing game. The current strategy, matching keywords and then letting a human adjudicate priority, is, frankly, the only sensible approach in a world where threat actors are far more agile than compliance checkboxes.
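To illustrate how the two metrics can sit side by side without either acting as a hard filter, here is a hedged sketch that tags an alert for the human reviewer. It is not part of the original script. It assumes the public EPSS API from FIRST, queried with a `cve` parameter and returning a `data` list whose rows carry `cve` and `epss` fields. The thresholds and labels are illustrative only.

```python
import json
import urllib.parse
import urllib.request

EPSS_URL = "https://api.first.org/data/v1/epss"


def parse_epss(payload: dict) -> dict:
    """Map CVE ID -> EPSS probability (0.0 to 1.0) from an API response."""
    return {row["cve"]: float(row["epss"]) for row in payload.get("data", [])}


def fetch_epss(cve_ids: list) -> dict:
    """Fetch EPSS scores for several CVE IDs in one request (network call)."""
    query = urllib.parse.urlencode({"cve": ",".join(cve_ids)})
    with urllib.request.urlopen(f"{EPSS_URL}?{query}", timeout=30) as resp:
        return parse_epss(json.load(resp))


def priority(cvss, epss, cvss_high=7.0, epss_high=0.1) -> str:
    """Label an alert; thresholds are illustrative, not recommendations."""
    if epss is not None and epss >= epss_high:
        return "urgent"  # likely to be exploited, whatever the severity
    if cvss is not None and cvss >= cvss_high:
        return "high"    # severe on paper
    return "review"      # still shown: a human decides, nothing is dropped
```

The point of the `review` label is that nothing is discarded. A keyword-matched entry with a CVSS score of 5.9 and no EPSS data still reaches the channel.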

This approach isn’t just about saving time; it’s about re-centering security focus on actionable threats rather than the overwhelming noise of every single disclosed vulnerability. It’s a stark reminder that sometimes, the most effective solutions are the simplest, provided they are implemented with a deep understanding of the problem’s practical realities.


Written by
Open Source Beat Editorial Team

Originally reported by Dev.to
