Documentation — Duck Data Master

// GETTING STARTED

Quick Start

Duck Data Master Guru is self-service — one command deploys the full analytics stack to your cloud instance. You can go from signup to querying your data in under five minutes.

1. Sign up for a 3-day free trial. Go to signup.duckdatamaster.guru and enter your email and password. No credit card required. You get 3 days of full Guru access.

2. Deploy your cloud analytics instance. In the Guru Portal, click Generate install command. Run the one-line command on any GCP, AWS, Azure, or VPS instance running Ubuntu 22.04+. The full analytics stack installs in under 5 minutes.

3. Ask a question in plain English. Type your question in the AI bar at the top, for example "What were my top 10 customers by revenue last quarter?" The AI writes the SQL. Review it, hit Run, and see results in seconds.

Sign Up & Trial

The 3-day free trial gives you full Guru access from minute one. No credit card is required to start.

Starting your trial

Go to signup.duckdatamaster.guru
Create an account with your email and password
Select Duck Data Master Guru — $99/mo platform fee
Enter payment details. You will not be charged until the 3-day trial ends
Cancel any time before the trial ends and pay nothing

Trial converts automatically. If you don't cancel before day 3, your card is charged for the first month. You can cancel at any time from the account screen inside the app.

// DASHBOARD TABS

Ingest Tab

The Ingest tab is the first tab — step one of every pipeline. Three data entry points are available inline, no modals:

File Upload — drag-and-drop or click to browse. Loads directly into the analytics engine on your instance.
Local Path — enter an absolute file path on your instance (e.g. /data/orders.parquet).
GCS Bucket Browser — browse your dedicated GCS bucket, create folders, delete files/folders, and load selected files directly. See GCS File Manager below.

Every file loaded becomes a named table queryable immediately in the Query, Transform, Profile, Join, ML, and Fuzzy tabs.

GCS Bucket File Manager Guru

Your Guru instance includes a dedicated GCS bucket, auto-connected via the instance's service account. The Ingest tab shows a full file manager for that bucket — no Cloud Console required.

Navigation

Breadcrumb path at the top — click any segment to jump to it
../ button — go up one folder level
Click any folder row to navigate into it

Operations

📁 New Folder — creates a folder at the current path (uses a .keep placeholder blob in GCS)
🗑 Delete file — per-row delete button on every file row
🗑 Delete folder — deletes the folder and all blobs inside it recursively
☑ Select + Load Selected → — check any files and load them into the analytics engine in one click

Folders in GCS are prefixes, not real objects. The dashboard creates a folder/.keep placeholder to make the folder visible. Deleting a folder removes all blobs with that prefix, including the .keep file.
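
The prefix behaviour described above can be sketched in plain Python. This is a minimal in-memory model, not the dashboard's implementation: the bucket is just a set of blob names, the helper names (create_folder, list_folder, delete_folder) are hypothetical, and no GCS API is called.

```python
# In-memory model of a bucket: a set of blob names, no real folders.
# All names here are hypothetical; nothing talks to GCS.

def create_folder(blobs: set[str], path: str) -> None:
    """Make a 'folder' visible by adding a .keep placeholder blob."""
    blobs.add(path.rstrip("/") + "/.keep")


def list_folder(blobs: set[str], path: str) -> list[str]:
    """Return every blob name that starts with the folder prefix."""
    prefix = path.rstrip("/") + "/"
    return sorted(name for name in blobs if name.startswith(prefix))


def delete_folder(blobs: set[str], path: str) -> int:
    """Remove all blobs under the prefix, .keep included; return the count."""
    doomed = list_folder(blobs, path)
    blobs.difference_update(doomed)
    return len(doomed)


bucket = {"orders.parquet"}
create_folder(bucket, "raw/2024")
bucket.add("raw/2024/jan.csv")
removed = delete_folder(bucket, "raw/2024")
print(removed, sorted(bucket))
```

Deleting the folder removes both the data file and the placeholder, which is why an emptied folder disappears from the listing entirely.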

Extract Tab Guru

The Extract tab goes beyond standard file upload — connect to open table formats, run geospatial queries, and read cloud data lakes directly without pre-loading files into memory.

Spatial Analytics

The spatial extension enables full geospatial SQL with 50+ ST_ functions:

SELECT name, ST_Distance(location, ST_Point(-112.07, 33.45)) AS dist_degrees
FROM stores
WHERE ST_Distance(location, ST_Point(-112.07, 33.45)) < 0.5
ORDER BY dist_degrees;

Supports GeoJSON, WKT, and coordinate point geometry types.
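
The query above filters on a distance expressed in degrees, so it can help to see roughly what 0.5 degrees means on the ground near the query point. The sketch below is a plain-Python haversine calculation, not part of the platform. It assumes a spherical Earth with a 6371 km radius and reads the point as (longitude, latitude); the helper name haversine_km is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation (assumption)


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# 0.5 degrees north and 0.5 degrees east of the query point.
north_km = haversine_km(-112.07, 33.45, -112.07, 33.95)
east_km = haversine_km(-112.07, 33.45, -111.57, 33.45)
print(round(north_km, 1), round(east_km, 1))
```

At this latitude half a degree is about 56 km north-south but only about 46 km east-west, so a threshold in degrees describes an ellipse on the ground rather than a circle.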

Delta Lake & Apache Iceberg

Read open table format data lakes directly from GCS or S3 — no Spark cluster required:

-- Read a Delta table from GCS
SELECT * FROM delta_scan('gs://your-bucket/delta-table/');

-- Read an Iceberg table
SELECT * FROM iceberg_scan('gs://your-bucket/iceberg-table/');

Direct Cloud Query (httpfs)

Query Parquet, CSV, or JSON files on S3/GCS/Azure without loading them into memory first — the engine streams what it needs:

-- Query a remote Parquet file without downloading it
SELECT region, SUM(revenue)
FROM read_parquet('gs://your-bucket/sales/*.parquet')
GROUP BY region;

Nothing lands permanently on disk. httpfs streams data directly for each query — ideal for large data lakes where you only need a subset of files per session.

// LOADING DATA

File Upload

The Ingest tab accepts files via drag-and-drop or click to browse. Load multiple files in one session — each becomes a separate named table.

How it works

When you drop a file, it loads directly into the analytics engine running on your cloud instance. Your file never leaves your cloud environment. No upload to Duck Data Master servers, no data movement, no shared compute.

Table naming

The table name is derived from the filename. sales_2024.csv becomes the table sales_2024. You can query it immediately after loading.

-- After loading sales_2024.csv:
SELECT region, SUM(revenue) AS total
FROM sales_2024
GROUP BY region
ORDER BY total DESC;
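
The naming rule can be pictured with a small Python sketch. Only the basic behaviour (drop the extension, so sales_2024.csv becomes sales_2024) comes from this page. How the dashboard treats spaces, punctuation, or a leading digit is not documented here, so those rules and the function name are assumptions for illustration.

```python
import re
from pathlib import PurePosixPath


def table_name_from_filename(filename: str) -> str:
    """Derive a SQL-friendly table name from a file name (assumed rules)."""
    stem = PurePosixPath(filename).stem          # "sales_2024.csv" -> "sales_2024"
    name = re.sub(r"\W+", "_", stem).strip("_")  # assumption: other chars become "_"
    if not name or name[0].isdigit():
        name = "t_" + name                       # assumption: avoid a leading digit
    return name


print(table_name_from_filename("sales_2024.csv"))
```

If a loaded file does not show up under the name you expect, check the table list in the dashboard rather than relying on this sketch.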

Loading multiple files

Drop multiple files at once or load them one at a time. Each becomes a separate table. You can then JOIN across tables in a single query.

-- Loaded orders.csv and customers.csv — join them:
SELECT c.name, COUNT(o.id) AS order_count
FROM orders o
JOIN customers c ON o.customer_id = c.id
GROUP BY c.name
ORDER BY order_count DESC;

Load from URL

Enter a table name, paste any public HTTPS URL pointing to a supported file into the URL box below the drop zone, and click Load.

-- Example public Parquet file
https://data.cityofchicago.org/api/views/wrvz-psew/rows.parquet

The file is fetched and loaded directly into the analytics engine. It does not pass through our backend.

Cloud Storage Connectors Guru

Connect to files stored in Amazon S3, Google Cloud Storage, or Azure Blob Storage from the Extract tab (Direct Cloud Query section) or the Ingest tab (GCS bucket browser). Enter your credentials in the Extract tab to authenticate against any external bucket or S3 path.

Your credentials are never stored. They are used to authenticate your cloud instance directly against your bucket. The file loads from your cloud storage into your cloud instance. Duck Data Master servers never touch your data.

Amazon S3

Access Key ID — your AWS IAM access key
Secret Access Key — your AWS IAM secret
Region — e.g. us-east-1 (defaults to us-east-1 if blank)
Path — s3://your-bucket/path/to/file.parquet
Endpoint — optional, only needed for S3-compatible storage like MinIO

Google Cloud Storage

Service Account JSON — paste the full contents of your .json key file
Path — gs://your-bucket/path/to/file.parquet

Azure Blob Storage

Account Name — your storage account name
Account Key — your storage account access key
Path — az://your-container/path/to/file.parquet
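
The three path formats above share one shape: a scheme, a bucket or container, and an object key. As a rough illustration, this standard-library sketch splits a path into those parts. The function name and the returned tuple are hypothetical and say nothing about how the dashboard parses paths internally.

```python
from urllib.parse import urlparse

# Scheme -> provider, taken from the path examples above.
PROVIDERS = {
    "s3": "Amazon S3",
    "gs": "Google Cloud Storage",
    "az": "Azure Blob Storage",
}


def split_cloud_path(path: str) -> tuple[str, str, str]:
    """Split a cloud path into (provider, bucket_or_container, object_key)."""
    parsed = urlparse(path)
    if parsed.scheme not in PROVIDERS:
        raise ValueError(f"unsupported scheme in {path!r}")
    return PROVIDERS[parsed.scheme], parsed.netloc, parsed.path.lstrip("/")


print(split_cloud_path("s3://your-bucket/path/to/file.parquet"))
```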

Supported File Formats

Format	Extension	Notes
CSV	.csv	Auto-detects delimiter, types, encoding
TSV	.tsv, .txt	Tab-separated values
Excel	.xlsx	First sheet loaded by default
JSON	.json	Array of objects or newline-delimited
NDJSON	.ndjson, .jsonl	Newline-delimited JSON
Parquet	.parquet	Column-oriented, ideal for large datasets
Apache Arrow	.arrow, .ipc	Zero-copy columnar format

// QUERY TAB — SQL MODE

Query Tab — SQL NL Mode

The AI bar at the top of the dashboard converts plain-English questions into SQL queries. Select SQL mode, type your question, press Enter, and the generated SQL runs immediately — results appear in the Query tab as a paginated table.

How it works

Your question and the schema of your loaded tables are sent to our AI backend (Gemini on Vertex AI). The AI returns SQL only — your actual data rows are never sent to the AI.
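
To make the schema-only idea concrete, here is a sketch of what such a request could contain. The actual request format is not documented on this page, so the field names (question, schema, columns) and the function name are assumptions. The point is only that column names and types are included and data rows are not.

```python
import json


def build_ai_request(question: str, tables: dict[str, dict[str, str]]) -> str:
    """Serialize the question plus column names and types; no data rows."""
    payload = {
        "question": question,
        "schema": [
            {
                "table": table,
                "columns": [{"name": n, "type": t} for n, t in cols.items()],
            }
            for table, cols in tables.items()
        ],
    }
    return json.dumps(payload)


request_body = build_ai_request(
    "What were my top 10 customers by revenue last quarter?",
    {"orders": {"customer_id": "INTEGER", "revenue": "DOUBLE", "order_date": "DATE"}},
)
print(request_body)
```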

Example questions

-- These all work as plain-English AI questions in SQL mode:
"What were my top 10 customers by revenue last quarter?"
"Show me month-over-month growth in orders for 2024"
"Find all rows where the status is null or empty"
"Calculate the 90th percentile of order value by region"
"Which product categories have declining sales over the last 3 months?"

Results panel

Results appear in the Query tab with row count and elapsed time. From there you can: Save as Table (registers result as a named session table), Export CSV, or push to → Notebook to inject the result as a pandas DataFrame cell.

AI query limits

Guru subscribers get 2,000 AI queries per day (resets at midnight UTC). The counter appears in the top-right RAM/usage bar of the dashboard.
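
If you script against the quota, the reset time is easy to compute yourself. This sketch assumes only what is stated above (2,000 queries per day, reset at midnight UTC). The function names are hypothetical, and the dashboard counter remains the authoritative number.

```python
from datetime import datetime, timedelta, timezone

DAILY_LIMIT = 2000  # Guru quota stated above


def next_reset(now: datetime) -> datetime:
    """Return the next midnight UTC after `now`."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def remaining(used_today: int) -> int:
    """Queries left before the daily limit is reached."""
    return max(DAILY_LIMIT - used_today, 0)


now = datetime(2025, 3, 1, 21, 30, tzinfo=timezone.utc)
print(next_reset(now) - now, remaining(1850))
```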

// QUERY TAB — PYTHON MODE

Query Tab — Python NL Mode

Switch the top bar to Python mode to ask questions that return executable Python — pandas transformations, matplotlib charts, statistical summaries, and more. The AI generates a complete Python script, runs it server-side on your instance, and streams stdout and figures back to the Query tab.

How it works

Your question, the schema of your loaded tables, and any SQL you've already run are sent to the AI. The AI returns a Python script using the pre-loaded con (analytics connection) object. The script runs in an isolated exec environment on your VM — output, print statements, and matplotlib/plotly figures all render inline.

Pre-loaded in every Python execution

con          # analytics connection — query any loaded table
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

Example questions

# These all work as plain-English questions in Python mode:
"Plot a histogram of trip distances grouped by passenger count"
"Show a correlation heatmap for all numeric columns"
"Give me a statistical summary of the fare_amount column"
"Plot monthly revenue trend with a 30-day rolling average"
"Rank the top 20 pickup zones by total tips"

Sending results to Notebook

The → Notebook button in the Query tab injects the generated Python code directly into a new Notebook cell — ready to edit, extend, and re-run with full Jupyter-style keyboard shortcuts.

AI query limits

Python NL queries share the same 2,000/day Guru quota as SQL NL queries.

Chat Tab — Duck Master AI

Duck Master is a conversational AI assistant in the Chat tab (last tab, 💬). Unlike the NL bar at the top (which generates and runs one query at a time), Duck Master maintains a full conversation — ask follow-up questions, request data profiles, get cleaning suggestions, and iterate.

How to use it

Click the Chat tab. Type your question. Duck Master responds in conversation, referencing your loaded tables by name. Chat history is preserved in localStorage for the session.

Pipeline walkthrough

Ask Duck Master: "walk me through a data pipeline step by step" — he'll guide you tab by tab: Ingest → Extract → Query → Transform → Profile → Join → ML Score → Fuzzy → Export → PQC Sign → Notebook.

What Duck Master can do

Profile a table — identify column types, null %, unique counts, value ranges
Spot data quality issues — mixed types, suspicious nulls, case inconsistencies
Write transform SQL — CREATE OR REPLACE TABLE with CAST, TRIM, REGEXP_REPLACE
Answer questions about your specific data — references actual column names and counts
Generate analysis queries — pivot tables, time series, ranked lists, cohort analysis
Inject answers into Notebook — click "→ Notebook" to send any response directly into a new notebook cell

Load a file before asking about your data. Once a table is loaded, Duck Master sees its schema and row count and gives much more specific, actionable responses.

// SQL & RESULTS

Transform Tab — SQL Editor

The Transform tab is a full CodeMirror SQL editor with syntax highlighting. Write any SQL query or multi-statement script, then click Run or press Ctrl+Enter.

Keyboard shortcuts

Ctrl+Enter — run the current query
Tab — insert 2-space indent

Results appear inline below the editor. Use CREATE OR REPLACE TABLE cleaned AS SELECT ... to save a transformed table for use in other tabs.

SQL Reference

Duck Data Master uses a full analytical SQL engine — not a subset. The following features are all supported:

Window functions

SELECT
  order_date,
  revenue,
  SUM(revenue) OVER (ORDER BY order_date) AS running_total,
  LAG(revenue) OVER (ORDER BY order_date)  AS prev_revenue,
  RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rank_in_region
FROM sales;

PIVOT

PIVOT sales
ON month
USING SUM(revenue)
GROUP BY region;

CTEs

WITH monthly AS (
  SELECT DATE_TRUNC('month', order_date) AS month, SUM(revenue) AS total
  FROM orders
  GROUP BY 1
)
SELECT month, total,
  total - LAG(total) OVER (ORDER BY month) AS mom_change
FROM monthly;

Other capabilities

Joins — INNER, LEFT, RIGHT, FULL, CROSS, ASOF
Aggregates — SUM, COUNT, AVG, MEDIAN, PERCENTILE_CONT, STDDEV, VARIANCE
Regex — REGEXP_MATCHES, REGEXP_REPLACE, REGEXP_EXTRACT
Time series — DATE_TRUNC, DATE_DIFF, AT TIME ZONE
Nested data — LIST, STRUCT, UNNEST
Data profiling — SUMMARIZE table_name

Export Tab

The Export tab handles downloads and cloud write-back. Select any loaded table, choose a format, and download or push to cloud storage.

Download formats

CSV — opens in Excel, Sheets, or any downstream tool
Parquet — columnar format ideal for large datasets and pipelines
JSON — records-oriented, ready for APIs or document stores

Files are generated on your cloud instance and transferred directly to your browser — they never pass through Duck Data Master servers.

Write back to cloud storage

Use the GCS Write-Back section in the Export tab to push any table directly to your GCS bucket — or use COPY TO in the Transform tab for full control:

COPY (SELECT * FROM orders WHERE year = 2025)
  TO 'gs://your-bucket/orders/2025/' (FORMAT PARQUET, PARTITION_BY (month));

PQC-signed exports

Check Sign with ML-DSA-65 before exporting to produce a tamper-evident .sig file alongside the data file. See Post-Quantum Signed Exports for full details.

Fuzzy Match Tab Guru

The Fuzzy tab finds approximate string matches across two tables using Jaro-Winkler similarity — without exact-match SQL. Practical for deduplication, entity resolution, and joining messy real-world data where names aren't standardized.

When to use it

"Acme Corp" vs "ACME Corporation" vs "Acme Corp." — same company, different strings
Customer name matching across two CRM exports
Product name deduplication in a merged catalog
Address matching without exact zip/street agreement

Workflow

Select Table A and the string column to match from
Select Table B and the string column to match against
Set the similarity threshold (0.0–1.0 — default 0.85)
Click Run Fuzzy Match
Results show matched pairs with their Jaro-Winkler score — export or save as a table

Threshold guidance: 0.95+ for near-exact matches. 0.85–0.94 for typical name variants. 0.75–0.84 for loose matching (more false positives). Review results and adjust.
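
To get a feel for the scores behind these thresholds, here is a reference sketch of the standard Jaro-Winkler definition in plain Python. It is not the code the Fuzzy tab runs, and details such as case handling or prefix weight in the dashboard may differ. The 0.1 prefix scale and 4-character prefix cap are the textbook defaults.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters match only if they sit within this distance of each other.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


for a, b in [("acme corp", "acme corp."), ("MARTHA", "MARHTA")]:
    print(a, b, round(jaro_winkler(a, b), 4))
```

Note that this sketch is case-sensitive, so "Acme Corp" and "ACME Corporation" would score low unless both sides are normalized (for example with LOWER and TRIM) before matching.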

AI Notebook Tab Guru

The Notebook tab is a full code + markdown cell environment with Jupyter-compatible keyboard shortcuts. Use it for custom analysis scripts, Python pipelines, and annotation — all running on your dedicated instance.

Jupyter keyboard shortcuts

Shift+Enter — run cell and move to next
Ctrl+Enter — run cell in place
Esc — enter command mode (amber border)
Enter — enter edit mode (green border)
A — insert cell above (command mode)
B — insert cell below (command mode)
M — convert cell to Markdown
Y — convert cell to Code
D, D — delete cell (double-tap D within 500ms)
↑ / ↓ — navigate cells in command mode

Cell features

Auto-growing cells — CodeMirror expands as you type, no scroll needed
Collapsible cells — click ▼ in the left gutter to fold long outputs
Cell number [N] shown in the left gutter
Amber left border = selected in command mode · Green border = edit mode

AI cell assist

Use the ✦ Suggest button (top-right of toolbar) to have Duck Master write a cell based on your description. The "→ Notebook" button in the Query tab and Chat tab injects results directly into a new cell.

Save & export

Click Save .ipynb to download the notebook as a standard Jupyter .ipynb file. It can be reopened in JupyterLab on your instance or any Jupyter environment.

Join Builder Guru

Build cross-table joins without writing SQL. The Join tab presents a visual form:

Select Table A and Table B from any loaded tables
Choose the join key column from each table
Choose join type: INNER, LEFT, RIGHT, or FULL OUTER
Optionally name the result — it saves as a new table you can query immediately

The generated SQL is shown before execution. Results display inline and are saved as last_result for export or further querying.
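
The exact SQL the Join Builder generates is not reproduced on this page. As a rough picture of how the four form inputs map to a statement, here is a hypothetical sketch. The function names, the table aliases, and the use of CREATE OR REPLACE TABLE are assumptions.

```python
JOIN_TYPES = {"INNER", "LEFT", "RIGHT", "FULL OUTER"}


def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_join_sql(table_a, key_a, table_b, key_b,
                   join_type="INNER", result="last_result"):
    """Build a CREATE TABLE ... JOIN statement from the form's inputs."""
    if join_type not in JOIN_TYPES:
        raise ValueError(f"unsupported join type: {join_type}")
    a, b = quote_ident(table_a), quote_ident(table_b)
    return (
        f"CREATE OR REPLACE TABLE {quote_ident(result)} AS\n"
        f"SELECT *\n"
        f"FROM {a} AS a\n"
        f"{join_type} JOIN {b} AS b\n"
        f"  ON a.{quote_ident(key_a)} = b.{quote_ident(key_b)};"
    )


sql = build_join_sql("orders", "customer_id", "customers", "id", "LEFT")
print(sql)
```

Quoting identifiers keeps table or column names with spaces or reserved words from breaking the statement, which matters when names come from uploaded file names.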

Tip: Load multiple Parquet files from your data lake, join them in the Join Builder, then write the joined result back to S3/GCS via the Export tab. That is a full pipeline in the dashboard, with no code.

ML Scoring Guru

Train and score machine learning models directly on any loaded table — no separate ML platform required. The ML tab in the dashboard provides end-to-end model training and inference.

Supported models

Classification: Random Forest, Gradient Boosting, Logistic Regression
Regression: Random Forest, Gradient Boosting, Linear Regression

Workflow

Select the target column (what you want to predict)
Select feature columns (numeric columns are used automatically)
Choose task type (Classification or Regression) and model
Set train/test split percentage
Click Train & Score

Output

Model accuracy (classification) or R² + MAE (regression) on the test split
Feature importance bar chart (Random Forest and Gradient Boosting)
All rows scored and saved as a new table: <table>_scored with column ddm_prediction
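
For readers who want to check the reported numbers by hand, the three test-split metrics named above have simple textbook definitions. The sketch below uses plain Python and made-up values. It illustrates the formulas only and is not the dashboard's scoring code.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))             # 3 of 4 labels correct
print(mae([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
print(r_squared([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
```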

Example: Load customers.csv, select churned as the target, and train a Random Forest. The dashboard writes customers_scored with a ddm_prediction value for every customer. Export to Parquet and write back to S3, all in the dashboard.

RAM & Performance

Performance scales with your cloud instance size. The analytics engine uses up to 85% of available instance RAM by default. Right-size your instance to your workload — you can scale up or down at any time.

A RAM gauge in the top-right of the dashboard header shows current usage. If RAM is running low, clear tables you no longer need before loading more files.

Instance sizing guide: A 4 vCPU / 16 GB instance handles hundreds of millions of rows comfortably. For billion-row workloads, use 16+ vCPU / 64 GB RAM with SSD-backed storage. A 10 GB Parquet file on a 32 GB instance is routine.

// CLOUD INSTANCE

Cloud Analytics Instance Guru

When you sign up, a dedicated GCP analytics instance is provisioned automatically — no command required. The full Duck Data Master stack is live in under 5 minutes.

What gets provisioned

GCP Compute Engine VM (your choice of tier — Starter 4 vCPU to Guru 176 vCPU)
Analytics engine with all extensions: httpfs, spatial, delta, iceberg
Python virtual environment — pandas, pyarrow, polars, plotly, scikit-learn, dilithium-py
12-tab FastAPI + Vanilla JS analytics dashboard (Duck Master AI built in)
JupyterLab for custom notebook pipelines
Caddy + Let's Encrypt TLS — your instance gets a subdomain at uid.inst.duckdatamaster.guru
Dedicated GCS bucket auto-connected via the instance's service account
systemd services — dashboard on :8000, Jupyter on :8888, Caddy reverse-proxy

Supported tiers

Starter (c3-standard-4 · 4 vCPU · 16 GB) up to Guru (c3-standard-176 · 176 vCPU · 704 GB). All Intel Sapphire Rapids. Scale up or down at any time from the portal.

Start / Stop Your VM Guru

Your portal shows your instance state in real time — LIVE, STOPPING, or STOPPED. The portal auto-refreshes every 30 seconds so you never need to reload the page to see a state change.

Stopping your instance

Click Stop Instance in the portal. The VM shuts down within seconds. You are no longer billed for compute — only the disk charge continues (~$8/mo). Stop your instance whenever you're done for the day.

We actively encourage you to stop your instance when you're not using it. Our revenue doesn't depend on you forgetting. Stop it, save money.

Starting your instance

When your instance is stopped, the dashboard button changes to ▶ Start VM + Open Dashboard. Click it — the portal starts your VM, polls every 8 seconds, and restores the button to Open Analytics Dashboard ↗ as soon as the instance is live. Boot time is typically 20–40 seconds.
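
The start-and-poll behaviour described above follows a common pattern, sketched here in Python. Only the 8-second interval and the LIVE state come from this page. The timeout, the function name, and the injected get_state callback are assumptions, and the portal does all of this for you.

```python
import time
from typing import Callable


def wait_until_live(
    get_state: Callable[[], str],
    interval_s: float = 8.0,      # poll interval stated above
    timeout_s: float = 120.0,     # assumption: give up after two minutes
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns "LIVE" or the timeout elapses."""
    waited = 0.0
    while waited <= timeout_s:
        if get_state() == "LIVE":
            return True
        sleep(interval_s)
        waited += interval_s
    return False


# Simulated instance that reports LIVE on the fourth poll (about 24 s in).
states = iter(["STOPPED", "STOPPED", "STOPPED", "LIVE"])
naps = []
ok = wait_until_live(lambda: next(states), sleep=naps.append)
print(ok, len(naps))
```

Passing the sleep function in as a parameter keeps the example instant to run. In real use the default time.sleep applies.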

Crash recovery

If your instance crashes unexpectedly (not a manual stop), the system detects it within 15 minutes and restarts it automatically. You receive an email notification when this happens. Manual stops are never auto-restarted.

Health-check reboot

If your instance is running but the analytics dashboard stops responding, the system performs a hard reboot automatically — stop then start — and notifies you by email. This clears stuck processes without data loss.

Monthly Budget Control Guru

Set a monthly compute spending limit directly in the portal. The system enforces it automatically — no surprise bills.

Setting your budget

In the portal, locate the Monthly Budget gauge
Click the amber dollar amount (e.g. $100 ✎)
Type your new monthly limit and press Enter
The gauge rescales immediately to reflect your new limit

Budget can be set anywhere from $1 to $10,000. A new limit is saved immediately and enforced from the next patrol cycle (within 15 minutes).

How enforcement works

$5 remaining — you receive a warning email. Instance keeps running.
$0 remaining (limit reached) — instance is stopped automatically and you receive a notification email. No further compute charges accrue.
To resume: raise your budget in the portal, then start your instance manually.

Budget resets on the 1st of each month when a new billing period begins. The warning flag also resets — you'll receive a fresh $5 warning if you approach the limit again the following month.
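
The enforcement rules above amount to a small decision function evaluated on each patrol cycle. The sketch below is one way to express them. The function name and the warned flag are hypothetical, and how the real system treats amounts exactly at a threshold is an assumption.

```python
WARNING_MARGIN = 5.00  # dollars remaining that triggers the warning email


def budget_action(limit: float, spent: float, warned: bool) -> str:
    """Return "stop", "warn", or "ok" for one patrol cycle."""
    remaining = limit - spent
    if remaining <= 0:
        return "stop"   # limit reached: stop the instance and send an email
    if remaining <= WARNING_MARGIN and not warned:
        return "warn"   # send the warning once per billing period
    return "ok"


print(budget_action(100, 60.00, warned=False))
print(budget_action(100, 95.50, warned=False))
print(budget_action(100, 95.50, warned=True))
print(budget_action(100, 100.10, warned=True))
```

The warned flag models the reset described above: it is cleared at the start of each billing period, so the warning can fire again the following month.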

Compute cost reference

Compute is billed at GCP list price plus 10%: Google's list price passed through with a 10% margin. Approximate hourly rates by instance tier:

Tier	vCPU / RAM	Rate	~Daily (8 hrs)
Starter	4 vCPU · 16 GB	$0.19/hr	~$1.56
Standard	8 vCPU · 32 GB	$0.39/hr	~$3.12
Pro	22 vCPU · 88 GB	$1.07/hr	~$8.58
Power	44 vCPU · 176 GB	$2.14/hr	~$17.15
Ultra	88 vCPU · 352 GB	$4.29/hr	~$34.30
Guru	176 vCPU · 704 GB	$8.57/hr	~$68.60

Stop your instance when you're done. A stopped instance costs ~$8/mo in disk only — zero compute.
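
To turn the table into a monthly figure for your own usage pattern, multiply the hourly rate by the hours you expect to run and add the disk charge. The sketch below uses the approximate rates and the ~$8 disk figure from this page. The function name is hypothetical, and the $99 platform fee is not included.

```python
# Approximate hourly rates copied from the table above (USD).
HOURLY_RATE = {
    "Starter": 0.19, "Standard": 0.39, "Pro": 1.07,
    "Power": 2.14, "Ultra": 4.29, "Guru": 8.57,
}
DISK_PER_MONTH = 8.00  # approximate disk charge quoted above


def monthly_estimate(tier: str, hours_per_day: float, days: int) -> float:
    """Hours times hourly rate, plus the flat disk charge."""
    compute = HOURLY_RATE[tier] * hours_per_day * days
    return round(compute + DISK_PER_MONTH, 2)


# Standard tier, 8 hours a day, 22 working days.
print(monthly_estimate("Standard", 8, 22))
```

Because the hourly rates in the table are rounded, results can differ by a few cents per day from the table's daily column. Treat the output as an estimate and use the Monthly Budget gauge for the actual figure.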

Duck Master AI on Your Instance Guru

Duck Master AI is built into the dashboard — accessible from the NL bar at the top of every tab and from the dedicated Chat tab. It runs against your instance's data via a persistent API key written at provisioning time.

Only your question and table schema (column names and types) are sent to the AI backend. Your actual data rows never leave your instance.

// SECURITY

Post-Quantum Signed Exports Guru

Every Guru cloud instance ships with an ML-DSA-65 signing keypair (NIST FIPS 204 — post-quantum secure). You can sign any exported file with one click and provide tamper-evident proof of data provenance to any downstream system.

Keypair location

~/.duckpqc/signing.sec — secret key (mode 600 — never share)
~/.duckpqc/duckpqc.pub — public key (share with clients for verification)

Keys persist across dashboard restarts. Use the PQC Sign tab to manage the full keypair lifecycle without touching the command line.

Keypair lifecycle (PQC Sign tab)

⚡ Generate Keypair — creates a new ML-DSA-65 keypair on your instance
↻ Rotate — generates a new keypair, overwriting the old one (recipients will need the new public key)
↑ Save to Bucket — backs up the public key to your GCS bucket for safe-keeping
↓ Restore from Bucket — restores a previously saved keypair from GCS
🗑 Delete Keys — removes both keys from the instance (use before rotating on a shared system)
Copy — copies the public key to your clipboard for sharing with recipients

Signing an export

Open the Export tab in the dashboard
Choose your format (CSV, Parquet, or JSON)
Check Sign with ML-DSA-65 (NIST FIPS 204 · post-quantum)
Click Prepare download
Download the data file and the .sig file — send both to your recipient along with duckpqc.pub

Signature format

DUCKPQC-SIG-v1
file=report.parquet
ts=1748000000
sha256=<sha256-hex-of-file>
sig=<ml-dsa-65-hex-signature>

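The signature covers sha256(file) | filename | unix_timestamp, so each signature is tied to one file's contents and name and carries a timestamp. A signature cannot be reused for a different file, and recipients can reject signatures older than they are willing to accept.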
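
A recipient can do a first integrity check with nothing but the standard library by parsing the .sig file and comparing the recorded SHA-256 with the file they received. The sketch below covers only that step. It assumes the .sig file is plain text with one key=value pair per line, as shown in the layout above. The function names are hypothetical, and the ML-DSA-65 verification of the sig field against duckpqc.pub is deliberately left out, since the exact signed byte encoding is not specified on this page.

```python
import hashlib


def parse_sig(text: str) -> dict[str, str]:
    """Parse a DUCKPQC-SIG-v1 file into its key=value fields."""
    lines = text.strip().splitlines()
    if not lines or lines[0] != "DUCKPQC-SIG-v1":
        raise ValueError("not a DUCKPQC-SIG-v1 signature file")
    return dict(line.split("=", 1) for line in lines[1:])


def digest_matches(data: bytes, fields: dict[str, str]) -> bool:
    """Check the recorded sha256 against the file bytes.

    This only detects a changed file. Verifying the `sig` field with
    the public key is a separate step and is not shown here.
    """
    return hashlib.sha256(data).hexdigest() == fields["sha256"].lower()


data = b"region,revenue\nwest,100\n"
sig_text = "\n".join([
    "DUCKPQC-SIG-v1",
    "file=report.csv",
    "ts=1748000000",
    "sha256=" + hashlib.sha256(data).hexdigest(),
    "sig=00",  # placeholder, not a real signature
])
fields = parse_sig(sig_text)
print(fields["file"], digest_matches(data, fields), digest_matches(data + b"x", fields))
```

A matching digest alone proves nothing about who produced the file. Only the signature check against the sender's public key does that.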

What ML-DSA-65 means

NIST FIPS 204 — standardized post-quantum digital signature algorithm
Security level 3 — equivalent to AES-192, resistant to both classical and quantum attacks
Signature size: ~3.3 KB. Public key: ~1.9 KB. Verification is fast (<1 ms).
No PKI, no certificate authority, no certificate chain — self-sovereign key management

Use cases

Client deliverables — proof the report came from your instance, unmodified
Regulatory audit trail — time-stamped, cryptographically verifiable data provenance
Data supply chain — downstream systems verify inputs before processing
Compliance — demonstrate data integrity without sharing raw data

No other analytics platform at this price point offers post-quantum signed exports. Databricks and Snowflake do not include this. This is a Guru-exclusive feature, included at no extra charge.

// ACCOUNT

Privacy, Security & Compliance

Duck Data Master is a SaaS product. Your dedicated analytics instance runs in Google Cloud Platform (GCP) infrastructure managed by Duck Data Master — you do not need a GCP account. Your instance is isolated; no other customer shares your compute, memory, or storage.

Data flow

Data files — uploaded directly into the analytics engine on your dedicated instance. They never touch Duck Data Master application servers — only your dedicated compute node.
SQL execution — runs entirely on your dedicated instance. Query results are returned to your browser session.
AI queries — your plain-English question and your table schema (column names and types only — never data rows) are sent to Google AI (Vertex AI / Gemini) via our Cloud Run backend. Your actual data is never sent to the AI.
Cloud storage credentials — used only to authenticate your instance against your S3/GCS/Azure bucket. Credentials are used per-session and are never stored by Duck Data Master.
Exports — generated on your instance, transferred directly to your browser. Never routed through our application servers.

GCP infrastructure & billing

Your dedicated VM runs in a Duck Data Master GCP project (Google Cloud Platform billing account). You are billed for compute at GCP list price + 10% markup via Stripe, transparently. You do not need your own GCP account — we provision, manage, and maintain the infrastructure for you.

GCP data center: Your instance runs in us-central1 (Council Bluffs, Iowa, USA) by default. Google Cloud's data centers are physically secured, SOC 2 audited, and ISO 27001 certified.

Infrastructure compliance

Duck Data Master runs on Google Cloud Platform infrastructure, which holds the following certifications and authorizations. These apply to the underlying infrastructure — not Duck Data Master as an application:

SOC 2 Type II — GCP data centers are independently audited for security, availability, and confidentiality controls
ISO 27001 — GCP holds ISO 27001 information security management certification
PCI DSS Level 1 — GCP infrastructure meets PCI DSS standards (note: Duck Data Master does not process cardholder data — Stripe handles all payment processing)
HIPAA Eligible — GCP offers HIPAA-eligible services and executes BAAs with qualifying customers. Duck Data Master does not execute BAAs directly; if your workload requires a BAA, contact us to discuss your options
FedRAMP — Certain GCP services are FedRAMP authorized. Duck Data Master is not itself a FedRAMP-authorized product

Duck Data Master data practices

Your data rows are never stored by Duck Data Master application systems — only on your dedicated instance
Your dedicated instance is deleted when your subscription ends (data is your responsibility to export first)
Duck Data Master does not sell, share, or analyze your data
All traffic between your browser and your instance is TLS-encrypted (Let's Encrypt certificate, auto-renewed by Caddy)

Privacy by isolation, not policy. One customer per VM. Your compute, memory, and GCS bucket are isolated from all other customers by GCP's virtualization layer. There is no shared multi-tenant database holding your data.

Billing & Cancellation

Plan

Guru — $99/month platform fee + GCP compute at cost + 10%. Dedicated GCP analytics instance, auto-provisioned. 12-tab analytics dashboard + JupyterLab. All file formats. S3/GCS/Azure connectors. Spatial analysis, Delta/Iceberg, Fuzzy Match, ML Scoring, PQC signed exports. Direct engineer support.

3-day free trial. No credit card required to start. Full Guru access from minute one.

Compute billing

Compute is billed at GCP list price + 10%, charged via Stripe. You are only billed for hours your instance is running. Set a monthly budget cap in the portal — the instance stops automatically when you hit your limit. See Monthly Budget Control for details.

Cancelling

You can cancel at any time from the account screen inside the app. After cancellation, access continues through the end of your current billing period. No cancellation fees.

Payments

Payments are processed by Stripe. Duck Data Master never stores your card details.

Support

Email support@duckdatamaster.guru — most issues are resolved automatically within minutes. Complex questions reach the engineer directly.

Guru subscribers get priority response. For billing issues, include your account email.
