surveilr: Files to SQL Databases, Own Your Data

Tired of data living in proprietary SaaS dashboards or ephemeral scripts? surveilr offers a radical return to local control, transforming your digital life into queryable SQLite databases. It’s your data, your SQL, forever.

Key Takeaways

  • surveilr transforms local files, emails, and API data into queryable SQLite databases.
  • It emphasizes data ownership and avoids cloud lock-in with a local-first approach.
  • The tool enables powerful cross-domain queries, merging different data types for deeper insights.

Forget the glossy dashboards and subscription fees. surveilr wants you to own your data. Permanently.

What does this mean for regular folks, not just corporate drones? It means your audit trail doesn’t vanish when a vendor changes their pricing model or, worse, shuts down. It means your personal files, your emails, that mountain of project documentation – they all become standard SQLite databases. Locally. Offline. Yours.

Your Audit Trail Should Be a Database, Not a Rental Agreement

This is the core pitch, and it’s a good one. In a world obsessed with cloud lock-in and subscription services, surveilr is waving a very big, very SQL-shaped flag of defiance. It collapses the messy world of file indexing, data transformation, and API ingestion into a single, local primitive: SQLite.

Think about it. We’re drowning in data. And where does it go? Into a SaaS platform you can’t export properly, or a series of scripts that are impossible to maintain a year from now. CSV files that lose all context. surveilr offers an alternative: a .db file. A file that’s inspectable, durable, interoperable, portable, and, most importantly, permanent. It’s the antithesis of vendor lock-in.

Your audit trail should be a database you own, not a SaaS UI you rent.

This isn’t hyperbole. It’s a statement of intent. And the implementation? Pretty slick.

Install it via Homebrew (because of course you can). Initialize a database. Tell it to scan your Documents folder. Then open the SQL shell. It’s that simple. From there you can run queries that feel like magic: find PDFs modified after a certain date, track how many times a file has been renamed, or locate massive orphaned files.
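To make those queries concrete, here is a minimal sketch using Python's built-in sqlite3 module against a toy in-memory table. The `files` table, its `size_bytes` column, and the sample rows are assumptions made up for illustration; surveilr's real schema may look different, so treat this as the shape of the idea rather than its actual output.

```python
import sqlite3

# Hypothetical schema for illustration only; surveilr's real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE files (
           file_path TEXT,
           file_basename TEXT,
           size_bytes INTEGER,
           last_modified TEXT  -- ISO 8601 dates compare correctly as text
       )"""
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        ("/docs/report.pdf", "report.pdf", 120_000, "2024-03-10"),
        ("/docs/old.pdf", "old.pdf", 80_000, "2022-01-05"),
        ("/docs/video.mov", "video.mov", 900_000_000, "2023-07-21"),
    ],
)


def pdfs_modified_after(conn, iso_date):
    """Return paths of PDFs whose last_modified is later than iso_date."""
    rows = conn.execute(
        "SELECT file_path FROM files "
        "WHERE file_basename LIKE '%.pdf' AND last_modified > ? "
        "ORDER BY last_modified DESC",
        (iso_date,),
    )
    return [row[0] for row in rows]


def files_larger_than(conn, min_bytes):
    """Return paths of files bigger than min_bytes, largest first."""
    rows = conn.execute(
        "SELECT file_path FROM files WHERE size_bytes > ? "
        "ORDER BY size_bytes DESC",
        (min_bytes,),
    )
    return [row[0] for row in rows]


recent_pdfs = pdfs_modified_after(conn, "2024-01-01")
huge_files = files_larger_than(conn, 100_000_000)
```

Swap the in-memory connection for a path to a real .db file and the same two functions would work unchanged, provided the table and column names match.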

The Power of Cross-Domain Queries

But the real juice comes when you start mashing up different data sources. The examples they provide are fascinatingly practical. Imagine joining your file modification times with your email timestamps. Or correlating GitHub commits with specific incident tickets. This isn’t just about organizing your data; it’s about unlocking insights you couldn’t possibly get from siloed SaaS tools.

SELECT f.file_path, e.subject, e.date, f.last_modified
FROM files f
JOIN emails e ON e.subject LIKE '%' || f.file_basename || '%'
WHERE f.last_modified < e.date
ORDER BY e.date DESC;

This query matches each file to any email whose subject mentions its name, then keeps only the files last modified before that email was sent. It’s the kind of forensic detective work that becomes routine once everything lives in one database, and a peek into a future where your local machine is a powerful data analysis hub, not just a content consumption device.
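If you want to see the join behave before pointing it at real data, the sketch below runs the same query against two tiny in-memory tables. The table layouts are inferred from the column names in the query, and the sample rows are invented; nothing here is taken from surveilr's actual schema.

```python
import sqlite3

# Toy tables whose columns mirror the names used in the query above.
# The layout and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (file_path TEXT, file_basename TEXT, last_modified TEXT);
    CREATE TABLE emails (subject TEXT, date TEXT);

    INSERT INTO files VALUES
        ('/docs/budget.xlsx', 'budget.xlsx', '2024-02-01'),
        ('/docs/notes.md',    'notes.md',    '2024-05-01');

    INSERT INTO emails VALUES
        ('Please review budget.xlsx', '2024-02-03'),
        ('Updated notes.md attached', '2024-04-20');
    """
)

QUERY = """
SELECT f.file_path, e.subject, e.date, f.last_modified
FROM files f
JOIN emails e ON e.subject LIKE '%' || f.file_basename || '%'
WHERE f.last_modified < e.date
ORDER BY e.date DESC;
"""

# Only budget.xlsx qualifies: it was modified before its email was sent.
# notes.md was modified after its email, so the WHERE clause drops it.
matches = conn.execute(QUERY).fetchall()
```

One caveat worth knowing: in a LIKE pattern, `%` and `_` are wildcards, so a basename such as `q1_report.pdf` would also match `q1-report.pdf` in a subject line. For exploratory digging that is usually fine; for anything stricter, SQLite's `instr()` function does a literal substring check.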

Is This Just Another ETL Tool?

Not quite. ETL (Extract, Transform, Load) is usually a corporate buzzword for complex, often cloud-based systems. surveilr is simpler, more direct. It’s an ingestion layer. It speaks SQL. It outputs SQLite. It doesn’t pretend to be a platform or a dashboard. It’s a tool. A very good tool.

It uses Singer taps for API ingestion, meaning it can pull from a vast array of sources. GitHub, Jira, Salesforce – the usual suspects. It also handles content transformation: CSV to SQL, HTML to JSON, Markdown to queryable data. And email ingestion? IMAP support means your Gmail or Outlook data is fair game for SQL queries.
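The CSV-to-SQL idea is easy to picture with a few lines of standard-library Python. This is a hand-rolled sketch of the general technique, not surveilr's implementation: the `load_csv` helper, the `tickets` table, and the sample data are all hypothetical, and every column is stored as plain TEXT to keep things short.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text and return the number of rows inserted.

    Every column is stored as TEXT. Identifiers are wrapped in double
    quotes, which is enough for simple names but not for hostile input.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row supplies the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
sample = "ticket,status\nINC-1,open\nINC-2,closed\n"
inserted = load_csv(conn, "tickets", sample)

# Once loaded, the CSV is just another table to query.
open_tickets = conn.execute(
    "SELECT ticket FROM tickets WHERE status = 'open'"
).fetchall()
```

The point is less the loader itself than what comes after it: once the rows are in SQLite, they can be joined against files, emails, or anything else in the same database.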

Why Does This Matter for Developers?

Developers have always lived in the trenches of scattered data: log files, API outputs, configuration files. surveilr offers a path to bring order to that chaos. Instead of writing ad-hoc scripts for every log analysis or API data pull, developers can build persistent, queryable databases. This means faster debugging, easier trend analysis, and better documentation of operational data. It’s a significant step towards giving developers more control over the data they generate and consume.

Furthermore, the reliance on standard SQLite means excellent interoperability. Datasette, DuckDB, pandas, Grafana – all can tap into the data surveilr provides. This isn’t a closed system. It’s an open pathway to data ownership.
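Interoperability here comes down to the file format: anything that speaks SQLite can open the .db file and discover what is inside. The sketch below shows that round trip with Python's sqlite3 module standing in for both sides. The `example.db` file name and the `files` table are placeholders invented for the demo, not surveilr's real output.

```python
import os
import sqlite3
import tempfile

# Placeholder path and schema, purely for illustration.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")

# "Producer": stands in for whatever tool wrote the database.
writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE files (file_path TEXT, size_bytes INTEGER)")
writer.execute("INSERT INTO files VALUES ('/docs/a.txt', 42)")
writer.commit()
writer.close()

# "Consumer": a separate connection that knows nothing about the producer.
# It discovers the schema from sqlite_master and then reads the rows.
reader = sqlite3.connect(db_path)
tables = [
    row[0]
    for row in reader.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]
rows = reader.execute("SELECT file_path, size_bytes FROM files").fetchall()
reader.close()
```

The consumer half is roughly what any SQLite-aware tool does when it is pointed at the file, which is why no export step is needed.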

The Corporate Spin vs. The Real Deal

Now, let’s talk about the elephant in the room. Companies love to tell you their dashboard is the only way to see your data. They build these shiny interfaces, and you’re stuck within their confines. surveilr is the antithesis of this. It’s not about flashy UIs; it’s about the raw power of SQL on data you control. This is a return to principles that open source was built on: transparency, control, and permanence.

My only quibble? The name. ‘surveilr’ sounds a tad ominous. But then again, maybe that’s the point. We should be surveilling our own data. Keeping a close eye on it. Ensuring it remains in our possession.

For anyone who’s ever lost data to a forgotten subscription, wrestled with unmaintainable scripts, or just wished their files were more… queryable, surveilr is a breath of fresh air. It’s pragmatic, powerful, and refreshingly local-first.


Frequently Asked Questions

What does surveilr actually do?

surveilr turns your files, emails, and API data into standard SQLite databases that you can query using SQL on your local machine.

Is this a replacement for my cloud database?

No, surveilr is designed for local data and operational data that you want to control and query indefinitely. It complements, rather than replaces, cloud databases for active applications.

Do I need to be a SQL expert to use it?

While advanced SQL will unlock the most power, basic queries for finding specific information are relatively straightforward, and the tool aims to make data accessible. The examples provided show both simple and complex query possibilities.

Written by
Open Source Beat Editorial Team

Originally reported by Dev.to
