Filings in your lakehouse,
via Delta Sharing.
The full FinancialReports corpus delivered through the open Delta Sharing protocol. Read directly from Databricks, Spark, pandas, or any compatible client — no data duplication, no ETL to build.
What Delta Sharing delivery is, in three lines.
Same data as the API and S3 bucket. Delivered via the open Delta Sharing protocol — works with Databricks Unity Catalog, Apache Spark, pandas, and any Delta Sharing-compatible client.
A share profile and a bearer token.
Delta tables for filings, companies, and reference data. In Databricks, they appear as a catalog in Unity Catalog. Outside Databricks, use the open-source delta-sharing Python client or any compatible connector.
Teams running Spark or lakehouse architectures.
ML engineers training models on filing text. Data engineers building medallion-architecture pipelines. Quant teams running PySpark notebooks. If your compute is Spark-native, Delta Sharing is the zero-copy path.
Annual subscription. You pay your compute.
The data subscription is flat. You pay Databricks (or your Spark vendor) for compute. Delta Sharing reads are serverless on the provider side — no warehouse to spin up on our end.
Three bulk channels, one dataset.
Choose by infrastructure fit. All three share the same filing IDs and schema — you can mix channels or migrate between them without data re-mapping.
From zero to your first query.
Four steps. Works with Databricks Unity Catalog or the open-source delta-sharing Python package. No cloud-specific setup required on your side.
Request access
Contact us or email [email protected]. We'll provision a Delta Sharing share scoped to your contract and send you a share profile file (.share).
Import the share profile
Databricks: Add the share as a provider in Unity Catalog via Data → Delta Sharing → Providers. Open-source: Save the .share file and point the Delta Sharing client at it.
Create a catalog (Databricks) or load tables (OSS)
Databricks: Create a catalog from the share — tables appear in Unity Catalog alongside your own data. OSS: Use delta_sharing.load_as_spark() or delta_sharing.load_as_pandas().
Query
The tables are live: new filings appear daily. Query with SQL, PySpark, or pandas. See the examples below.
-- After creating a catalog from the share:
-- Latest German annual reports
SELECT filing_id, company_name, title, release_datetime, fiscal_year
FROM financialreports.default.filings
WHERE country_code = 'DE'
  AND filing_type_code = '10-K'
  AND release_datetime >= DATE_SUB(CURRENT_DATE(), 365)
ORDER BY release_datetime DESC
LIMIT 25;

-- Cross-catalog join to your own tables
SELECT h.portfolio, f.company_name, f.title, f.release_datetime
FROM financialreports.default.filings f
JOIN my_catalog.default.holdings h
  ON f.company_id = h.fr_company_id
WHERE f.release_datetime >= DATE_SUB(CURRENT_DATE(), 7)
ORDER BY f.release_datetime DESC;
Three clients, same data.
Delta Sharing is an open protocol. Use it from Databricks SQL, PySpark notebooks, or any machine with the delta-sharing Python package. No Databricks account required for the open-source client.
-- Unity Catalog: shared tables are first-class
SELECT company_name,
       country_code,
       COUNT(*) AS n_filings,
       MAX(release_datetime) AS latest
FROM financialreports.default.filings
WHERE release_datetime >= '2026-01-01'
  AND country_code IN ('DE', 'FR', 'GB')
GROUP BY company_name, country_code
ORDER BY n_filings DESC
LIMIT 20;
# In a Databricks notebook or any Spark cluster
filings = spark.table("financialreports.default.filings")

# Filter and join with internal data
eu_annuals = (
    filings
    .filter("country_code IN ('DE','FR','GB','IT','ES')")
    .filter("filing_type_code = '10-K'")
    .filter("fiscal_year = 2025")
)

holdings = spark.table("my_catalog.default.portfolio")

(
    eu_annuals.join(holdings, "company_id", "inner")
    .select("company_name", "title", "release_datetime")
    .orderBy("release_datetime", ascending=False)
    .show(20, truncate=False)
)
import delta_sharing

# Works anywhere, no Databricks account needed
profile = "financialreports.share"

# List available tables
client = delta_sharing.SharingClient(profile)
tables = client.list_all_tables()
for t in tables:
    print(f"{t.share}.{t.schema}.{t.name}")

# Load filings into pandas
table_url = f"{profile}#financialreports.default.filings"
df = delta_sharing.load_as_pandas(table_url)

# Filter locally
de_annuals = df[
    (df["country_code"] == "DE")
    & (df["filing_type_code"] == "10-K")
].sort_values("release_datetime", ascending=False).head(25)
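The pandas example loads the whole filings table before filtering. For a quick look at the data, the client's `limit` argument to `load_as_pandas` caps the number of rows fetched. The sketch below assumes the share and schema names used in the examples (`financialreports`, `default`); the helper names `table_url` and `preview` are hypothetical.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' URL the client expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def preview(profile_path, share, schema, table, rows=100):
    """Load at most `rows` rows into pandas instead of the full table."""
    import delta_sharing  # imported lazily so table_url works without the package

    url = table_url(profile_path, share, schema, table)
    return delta_sharing.load_as_pandas(url, limit=rows)


if __name__ == "__main__":
    # Only builds the URL; no network access happens here.
    print(table_url("financialreports.share", "financialreports", "default", "filings"))
    # With a real profile file in place (assumption), a preview would be:
    # df = preview("financialreports.share", "financialreports", "default", "filings")
```

Once the preview looks right, drop the limit or switch to Spark for full-table work.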
Open standard. Not vendor lock-in.
Delta Sharing is an open protocol hosted by the Linux Foundation. The data is Parquet-backed, the API is REST, and the client libraries are open source. You can read FinancialReports data from any compatible tool, not just Databricks.
Linux Foundation, Apache 2.0.
The protocol spec, reference server, and all client libraries are open source under the Linux Foundation's Delta Lake project. No proprietary extensions required.
Python, Spark, pandas, R, Rust.
The open-source delta-sharing Python package works anywhere: laptops, CI/CD, cloud VMs. Databricks Unity Catalog provides native integration. Apache Spark reads shares through the open-source delta-sharing-spark connector.
REST API with pre-signed Parquet URLs.
The sharing server returns pre-signed URLs to Parquet files. Your client downloads directly from object storage — no data passes through a proxy. Predicate pushdown and column pruning are protocol-native.
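For readers curious what the client does under the hood, here is a minimal sketch of the protocol's table query call. It builds the request without sending it and parses a made-up response. The endpoint, token, and response values are placeholders, and `build_query_request` and `parse_file_urls` are hypothetical helper names. The protocol treats predicate hints as best-effort, so clients should still filter the rows they read.

```python
import json
import urllib.request


def build_query_request(endpoint, token, share, schema, table,
                        predicate_hints=None, limit_hint=None):
    """Build (but do not send) a Delta Sharing table query request."""
    url = f"{endpoint.rstrip('/')}/shares/{share}/schemas/{schema}/tables/{table}/query"
    body = {}
    if predicate_hints:
        body["predicateHints"] = list(predicate_hints)
    if limit_hint is not None:
        body["limitHint"] = limit_hint
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def parse_file_urls(ndjson_text):
    """Collect pre-signed Parquet URLs from a newline-delimited JSON response."""
    urls = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        action = json.loads(line)
        if "file" in action:  # protocol and metaData lines are skipped
            urls.append(action["file"]["url"])
    return urls


if __name__ == "__main__":
    # Placeholder endpoint and token, for illustration only.
    req = build_query_request(
        "https://sharing.example.com/delta-sharing/", "not-a-real-token",
        "financialreports", "default", "filings",
        predicate_hints=["country_code = 'DE'"], limit_hint=25,
    )
    print(req.get_method(), req.full_url)

    # Made-up response in the protocol's line-per-action shape.
    sample_response = "\n".join([
        json.dumps({"protocol": {"minReaderVersion": 1}}),
        json.dumps({"metaData": {"id": "placeholder"}}),
        json.dumps({"file": {"url": "https://bucket.example.com/part-0.parquet?sig=placeholder",
                             "id": "0", "size": 1024}}),
    ])
    print(parse_file_urls(sample_response))
```

In practice the delta-sharing client issues this call for you; the sketch only makes the request and response shape visible.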
Token-scoped, encrypted, audit-logged.
Every share is scoped to the tables and columns in your contract. Bearer tokens are rotatable, and all access is logged.
Bearer token per share profile.
Each client gets a unique .share profile with a bearer token. Tokens are rotatable — request a new one anytime without disrupting queries in flight.
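A share profile is a small JSON file. The Delta Sharing profile format defines `shareCredentialsVersion`, `endpoint`, and `bearerToken`, with an optional `expirationTime`. The sketch below checks a profile before use and masks the token for logging. The file name, endpoint, and token are placeholders, and `load_profile` and `redact` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path

# Fields the Delta Sharing profile format requires for bearer-token profiles.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def load_profile(path):
    """Read a .share profile and fail early if a required field is missing."""
    profile = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [name for name in REQUIRED_FIELDS if name not in profile]
    if missing:
        raise ValueError(f"profile {path} is missing fields: {missing}")
    return profile


def redact(profile):
    """Return a copy that is safe to log: the bearer token is masked."""
    safe = dict(profile)
    safe["bearerToken"] = "***"
    return safe


if __name__ == "__main__":
    # Placeholder values for illustration; a real profile comes from the provider.
    sample = {
        "shareCredentialsVersion": 1,
        "endpoint": "https://sharing.example.com/delta-sharing/",
        "bearerToken": "not-a-real-token",
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.share"
        path.write_text(json.dumps(sample), encoding="utf-8")
        print(redact(load_profile(path)))
```

After a token rotation, replacing the profile file is the only client-side change this sketch would need.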
Table and column level access.
Your share exposes only the tables and columns in your contract. Geography filters are applied at the share definition — not at query time. You can't access data outside your scope.
TLS everywhere. Encrypted at rest.
The sharing server API and all pre-signed Parquet URLs use TLS. Data at rest is encrypted with SSE-S3. Pre-signed URLs expire after a short window.
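Because pre-signed URLs expire, a long-running job may need to request a fresh file list rather than reuse old URLs. As a rough sketch, assuming the URLs are standard AWS SigV4 pre-signed S3 URLs (which carry `X-Amz-Date` and `X-Amz-Expires` query parameters), the expiry can be read from the URL itself. The helper name `presigned_expiry` and the sample URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_expiry(url):
    """Return the UTC expiry of a SigV4 pre-signed URL, or None if it has no such parameters."""
    params = parse_qs(urlparse(url).query)
    if "X-Amz-Date" not in params or "X-Amz-Expires" not in params:
        return None
    signed_at = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(params["X-Amz-Expires"][0]))


if __name__ == "__main__":
    # Made-up URL in the SigV4 query-string shape; the signature is a placeholder.
    url = ("https://bucket.example.com/part-0.parquet"
           "?X-Amz-Date=20260101T000000Z&X-Amz-Expires=900&X-Amz-Signature=placeholder")
    expiry = presigned_expiry(url)
    print("expires at", expiry.isoformat())
    print("still valid:", datetime.now(timezone.utc) < expiry)
```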
Questions every prospect asks.
If yours isn't here, email [email protected].
Do I need a Databricks account?
No. Delta Sharing is an open protocol. The delta-sharing Python package works on any machine — pip install delta-sharing, point it at the .share profile, and load tables into pandas or Spark. Databricks just makes it easier with Unity Catalog integration.
How does latency compare to Snowflake delivery?
Snowflake delivery runs hourly (~1–2 h latency). Delta Sharing delivery runs daily (~24 h latency). If you need sub-day latency in a lakehouse, pair Delta Sharing with the API or webhooks for real-time event triggers.
Can I get just EU data?
Yes. Your share is scoped to the geography in your contract — EU-only, global, single-country, or any combination. Adjusting scope later is a share configuration update on our side.
What tables are included?
The same six tables as Snowflake: filings, companies, sources, filing_types, filing_categories, and languages. Schema is identical across all delivery channels.
Can I access the original filing documents (PDFs)?
The Delta table includes raw_document_s3_key — the S3 path to the original file. We provision a cross-account IAM grant for document access, same as S3 bulk delivery. See S3 security docs.
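A minimal sketch of fetching one original document with boto3, assuming `raw_document_s3_key` holds an object key (not a full `s3://` URI) and that your AWS credentials already carry the cross-account grant. The bucket name is not stated here, so it is passed in as a parameter; `local_path_for` and `download_document` are hypothetical helper names.

```python
from pathlib import Path, PurePosixPath


def local_path_for(key, dest_dir):
    """Map an S3 object key to a local file path, keeping only the file name."""
    return Path(dest_dir) / PurePosixPath(key).name


def download_document(bucket, key, dest_dir="."):
    """Download one document; needs boto3 and credentials with read access."""
    import boto3  # imported lazily so local_path_for works without boto3

    target = local_path_for(key, dest_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    boto3.client("s3").download_file(bucket, key, str(target))
    return target


if __name__ == "__main__":
    # Made-up key for illustration; real keys come from raw_document_s3_key.
    print(local_path_for("filings/2025/example-report.pdf", "documents"))
```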
Is this the same data as the API and S3?
Exactly the same. Filing IDs match across all channels. A filing's filing_id in the Delta table is the same as the API's id field and the S3 Parquet's filing_id column. Mix and match channels freely.
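Matching IDs make it straightforward to reconcile records across channels. The sketch below pairs API records (keyed by `id`) with Delta table rows (keyed by `filing_id`) and reports API filings not yet present in the Delta table, which can happen between daily refreshes. The sample values are made up, and `match_by_filing_id` is a hypothetical helper name.

```python
def match_by_filing_id(api_records, delta_rows):
    """Pair API records with Delta rows; also return API ids with no Delta row yet."""
    delta_by_id = {row["filing_id"]: row for row in delta_rows}
    matched, missing = [], []
    for record in api_records:
        row = delta_by_id.get(record["id"])
        if row is None:
            missing.append(record["id"])
        else:
            matched.append((record, row))
    return matched, missing


if __name__ == "__main__":
    # Made-up records for illustration only.
    api_records = [{"id": 101}, {"id": 102}, {"id": 103}]
    delta_rows = [{"filing_id": 101, "country_code": "DE"},
                  {"filing_id": 102, "country_code": "FR"}]
    matched, missing = match_by_filing_id(api_records, delta_rows)
    print(len(matched), "matched;", "not yet in Delta:", missing)
```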
How does S3 bulk differ from Delta Sharing?
S3 bulk gives you raw Parquet files and Markdown documents on S3 — you own the query engine choice. Delta Sharing provides a structured table abstraction with predicate pushdown and column pruning built into the protocol. If you already use Databricks or Spark, Delta Sharing is simpler. If you use DuckDB or Athena, S3 is more natural.
Ready to add regulatory filings to your lakehouse?
Tell us your Databricks workspace URL (or just that you want Delta Sharing) and the geography you need. We'll provision a share profile within a business day.