Leaf Lake Overview - Leaf Documentation

Leaf Lake gives you direct SQL access to all of the agronomic data flowing through Leaf. Instead of downloading GeoJSON files and processing them locally, you send a SQL query to a single endpoint and get back JSON rows. The data in Leaf Lake comes from the same pipeline that produces Leaf’s standardGeojson files. Every point from every machine file — planting, application, harvest, tillage — is available as a row in the points table. Leaf also provides built-in environmental datasets: USDA SSURGO soil survey polygons and US state/county boundaries, ready for spatial joins.

How it works

Authenticate — Get a Bearer token from the Leaf API using your existing credentials. Leaf Lake uses the same authentication as all other Leaf endpoints.
Write SQL — Compose a query against the available tables. Leaf Lake uses BigQuery SQL with aggregations, CTEs, JOINs, and spatial functions.
POST the query — Send the SQL as the request body to the query endpoint. The response is a JSON array of result rows.

All queries are automatically scoped to the authenticated API owner. You only see your own data. Leaf Lake is read-only — only SELECT queries (including WITH, ORDER BY, UNION) are supported. No writes, updates, or schema changes.

Authenticate  →  Write SQL  →  POST /query  →  JSON rows

Data freshness

Leaf Lake’s ingestion pipeline runs continuously. New machine data typically appears in the points table within minutes of being processed by the standard Leaf pipeline. If a file has been converted and a standardGeojson exists, the data is on its way into the lake.

Available tables

Table	Contents	Geometry type
`points`	All operation data (planted, applied, harvested, tillage) from Leaf’s machine data pipeline	Point
`fields`	Leaf field boundaries synced from the Fields API	Polygon
`ssurgo`	USDA Soil Survey Geographic Database — soil map units with properties like drainage class, farmland classification, and soil name	Polygon
`states_counties`	US state and county boundaries	Polygon

The `points` table

Each row is a single data point from a machine file. The columns match the properties in Leaf’s standardGeojson output, which vary by operation type. See Sample Output for the full property reference. Each point includes a fieldIds column containing the Leaf field UUID(s) associated with that point. You can filter directly by field ID instead of providing a WKT boundary polygon. Points also carry leafUserId for user-level filtering. Common columns across all operation types: operationType, geometry, timestamp, crop, fieldIds, leafUserId, area, distance, heading, speed, elevation, equipmentWidth, recordingStatus, sectionId, machinery. Type-specific columns include seedRate and variety for planted, wetMassPerArea and harvestMoisture for harvested, appliedRate and products for applied, and tillageDepthTarget for tillage. The full breakdown is covered in Querying Leaf Lake.

The `fields` table

Leaf field boundaries, synced from the Fields API. Each row is a field with its merged boundary geometry. Automatically scoped to your API owner account. Use this table to spatially join operations with field boundaries without hardcoding WKT.

The `ssurgo` table

SSURGO data from the USDA provides soil map unit polygons with attributes like soil name, drainage class, and farmland classification. Use spatial joins to relate operation data to soil properties — for example, calculating average yield by soil type within a field.

The `states_counties` table

US state and county boundary polygons. Useful for grouping operations by geography or filtering to a region.

Spatial SQL functions

Leaf Lake uses BigQuery SQL with geography extensions. Geometry columns are stored as text and converted using ST_GeogFromText.

Function	Description
`ST_GeogFromText('WKT')`	Creates a geography from Well-Known Text
`ST_Intersects(geog1, geog2)`	Returns true if two geographies share any space
`ST_X(point)` / `ST_Y(point)`	Extracts longitude / latitude from a point
`ST_Area(geog)`	Returns the area of a polygon in square meters
`ST_Intersection(geog1, geog2)`	Returns the geography where two geographies overlap
`ST_AsText(geog)`	Converts a geography to Well-Known Text
`ST_ConvexHull(geog)`	Returns the convex hull of a geography
`ST_Union_Agg(geog)`	Aggregates multiple geographies into one

Common use cases

Yield by soil type: Join harvest points with SSURGO polygons to see how yield varies across soil map units within a field.
Seed variety comparison: Query planting data grouped by variety and year to compare seed rate distributions and coverage.
Multi-year trend analysis: Aggregate harvest data across seasons to track yield trends without downloading and processing files.
Product performance: Join application and harvest data to compare treated vs. control areas within a field.
Instant reprocessing: Change filtering criteria (outlier thresholds, moisture cutoffs) by modifying the SQL WHERE clause instead of reprocessing files through the pipeline.
Regional analysis: Join operations with state/county boundaries to aggregate data by geography.

What to do next

Querying Leaf Lake — Schema reference and example queries for each operation type, SSURGO, and state/county data.
API Reference: Leaf Lake — Endpoint details, request/response format, and code examples.
Authentication — How to get a Bearer token for API requests.
Sample Output — Full property reference for standardGeojson point data by operation type.

​How it works

​Data freshness

​Available tables

​The points table

​The fields table

​The ssurgo table

​The states_counties table

​Spatial SQL functions

​Common use cases

​What to do next