Machine Data Overview - Leaf Documentation

Leaf turns raw machine data into standardized, analysis-ready field operations. Data comes in from two sources: direct provider connections (John Deere, Climate FieldView, CNHi, CNHI FieldOps, Ag Leader, Trimble, Raven Slingshot) and manual file uploads. Regardless of the source, every file moves through the same processing pipeline.

The pipeline

Machine data flows through four stages:

Ingestion — Leaf receives proprietary-format files, either pulled from a connected provider or uploaded manually via the batch API.
Conversion — Each file is converted from its native format into Leaf’s standard canonical format (standardGeojson). Optionally, a filteredGeojson is produced with low-speed points and outliers removed. A summary with aggregate statistics (averages, min/max, totals) is generated for each file.
Merge — Files that belong to the same task (planting, harvest, application, or tillage) and overlap the same field boundary are grouped and merged into a single field operation. One operation may contain hundreds of machine files.
Output — Each field operation includes a standardGeojson, optional filteredGeojson, images (PNG or GeoTIFF), and a summary with geometry.

Provider / Upload  →  Machine Files  →  Field Operations
                        (per file)        (per field + task)
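To make the conversion stage more concrete, here is a minimal sketch of the idea behind a filtered output and a summary: drop low-speed points, then aggregate a numeric property. It is illustrative only and is not Leaf's actual filtering or summary logic. The property names `speed` and `seedRate`, the 0.5 threshold, and the summary keys are assumptions, and outlier removal is not shown.

```python
from statistics import mean


def filter_points(features, min_speed=0.5, speed_key="speed"):
    """Keep only features whose speed is at or above min_speed.

    The property name and threshold are assumptions for illustration.
    Features with no speed value are treated as stationary and dropped.
    """
    return [
        f for f in features
        if f["properties"].get(speed_key, 0.0) >= min_speed
    ]


def summarize(features, key):
    """Aggregate one numeric property: average, min, max, and total."""
    values = [f["properties"][key] for f in features if key in f["properties"]]
    if not values:
        return None
    return {
        "avg": mean(values),
        "min": min(values),
        "max": max(values),
        "total": sum(values),
    }


def point(lon, lat, **props):
    """Build a GeoJSON Point feature with the given properties."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": props,
    }


points = [
    point(-93.0000, 42.0, speed=0.1, seedRate=0.0),      # machine nearly stopped
    point(-93.0001, 42.0, speed=2.0, seedRate=30000.0),
    point(-93.0002, 42.0, speed=2.5, seedRate=34000.0),
]

filtered = filter_points(points)
summary = summarize(filtered, "seedRate")
```

In this sketch the near-stationary point is excluded before aggregation, so it does not drag the average down.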

Key concepts

Machine file — A single converted file. It has its own summary, standardGeojson, and unit map. Machine files are the building blocks of field operations. You can list and query them via the /files endpoints.

Field operation — The result of merging one or more machine files against a field boundary. Operations represent a real-world activity on a specific field: planting corn, harvesting soybeans, spraying herbicide, or running a tillage pass. You can list and query them via the /operations endpoints.

Field boundaries are required for operations. Without boundaries, Leaf still converts machine files and produces file-level summaries, but it cannot create field operations. Make sure boundaries exist before expecting operations to appear.

Operation types — Leaf recognizes four types: planted, harvested, applied, and tillage. The data properties available on summaries and point data vary by type.
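The relationship between machine files, boundaries, and operations can be sketched as a simple grouping. This is a deliberately simplified model, not Leaf's merge algorithm: the `MachineFile` class, its fields, and `group_into_operations` are hypothetical names, and the real merge may take other factors into account.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# The four operation types named in this overview.
OPERATION_TYPES = {"planted", "harvested", "applied", "tillage"}


@dataclass(frozen=True)
class MachineFile:
    file_id: str
    operation_type: str
    field_id: Optional[str]  # None when no field boundary overlaps the file


def group_into_operations(files):
    """Group file IDs by (field, operation type).

    Files with no overlapping boundary are skipped: they are still
    converted, but they cannot contribute to a field operation.
    """
    groups = defaultdict(list)
    for f in files:
        if f.operation_type not in OPERATION_TYPES:
            raise ValueError(f"unknown operation type: {f.operation_type}")
        if f.field_id is None:
            continue
        groups[(f.field_id, f.operation_type)].append(f.file_id)
    return dict(groups)


files = [
    MachineFile("f1", "planted", "field-a"),
    MachineFile("f2", "planted", "field-a"),
    MachineFile("f3", "harvested", "field-a"),
    MachineFile("f4", "planted", None),  # no boundary, so no operation
]

operations = group_into_operations(files)
```

Here `f1` and `f2` end up in one planting operation, `f3` forms its own harvest operation, and `f4` produces no operation because it has no boundary.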

Processing timing

Files from provider connections are pulled immediately on first sync and then at least every 24 hours. Providers with event-driven APIs (like John Deere) trigger processing sooner when new data arrives. Manually uploaded files begin processing as soon as Leaf receives them. The merge process that creates field operations runs continuously. Merging is computationally expensive and may take time for large datasets.
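Because merging can take a while, a client will usually poll for operations instead of expecting them right after upload. The sketch below shows one way to do that with exponential backoff. `fetch_operations` is a hypothetical callable you would supply (for example, a wrapper around a request to the /operations endpoints), and the delay and timeout values are arbitrary assumptions.

```python
import time


def wait_for_operations(
    fetch_operations,
    timeout_s=600.0,
    initial_delay_s=5.0,
    max_delay_s=60.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Call fetch_operations until it returns a non-empty list.

    Waits between attempts, doubling the delay up to max_delay_s, and
    raises TimeoutError if the next wait would pass the deadline.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        operations = fetch_operations()
        if operations:
            return operations
        if clock() + delay > deadline:
            raise TimeoutError("no field operations appeared before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)


# Usage with a stub in place of a real API call: the first two polls
# return nothing, the third returns one operation.
responses = iter([[], [], [{"id": "op-1"}]])
waits = []
result = wait_for_operations(
    lambda: next(responses),
    sleep=waits.append,   # record the waits instead of actually sleeping
    clock=lambda: 0.0,    # frozen clock keeps the example deterministic
)
```

Injecting `sleep` and `clock` keeps the function easy to test without real delays.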

What each stage produces

Stage	Output	Endpoint
Conversion	standardGeojson, filteredGeojson (optional), summary	`/files/{id}`, `/files/{id}/summary`
Merge	Field operation with merged standardGeojson, summary with geometry, images	`/operations/{id}`, `/operations/{id}/summary`
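As a small illustration of how the table maps onto requests, the helper below builds URLs from the path templates listed above. The base URL is a placeholder, not Leaf's real host, and the real API may add a prefix to these paths; check the API reference for the exact values. `endpoint_url` and the `ENDPOINTS` layout are hypothetical.

```python
from urllib.parse import quote

# Placeholder host for illustration only.
BASE_URL = "https://api.example.com"

# Path templates copied from the table above.
ENDPOINTS = {
    "conversion": {"detail": "/files/{id}", "summary": "/files/{id}/summary"},
    "merge": {"detail": "/operations/{id}", "summary": "/operations/{id}/summary"},
}


def endpoint_url(stage, kind, resource_id, base_url=BASE_URL):
    """Return the URL for a stage's detail or summary endpoint."""
    template = ENDPOINTS[stage][kind]
    # Percent-encode the ID so it is always a single path segment.
    safe_id = quote(str(resource_id), safe="")
    return base_url.rstrip("/") + template.replace("{id}", safe_id)


file_summary_url = endpoint_url("conversion", "summary", "abc-123")
operation_url = endpoint_url("merge", "detail", "op-42")
```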

Configuration

Leaf’s behavior is controlled by configurations set at the API owner or Leaf user level. Configurations affect which files are processed, how data is cleaned, and how operations are created. If a Leaf user has no custom configuration, it inherits the API owner’s settings. Key configurations for machine data:

cleanupStandardGeojson — Remove invalid points from the standardGeojson during conversion.
operationsFilteredGeojson — Generate a filtered version of the operation data with low-speed and outlier points removed.
operationsRemoveOutliers — Control whether statistical outlier removal is applied to harvest data.
outOfStandardOperations — Allow processing of operations that don’t meet standard validation criteria. These operations are marked as non-standard. Useful when you need data even from incomplete or edge-case files.
operationsProcessingRange — Limit how far back (in months) Leaf processes data from provider connections.
customDataSync — Restrict processing to specific fields.
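The inheritance rule described above can be sketched as a lookup with a fallback. The configuration names come from the list above, but the value types and example values are assumptions, and so is the per-key fallback: the text only states that a Leaf user with no custom configuration inherits the API owner's settings. `resolve_config` is a hypothetical helper, not part of the Leaf API.

```python
def resolve_config(owner_config, user_config=None):
    """Return the effective configuration for a Leaf user.

    With no user configuration, the owner's settings apply unchanged.
    Otherwise user values take precedence key by key (an assumption).
    """
    if not user_config:
        return dict(owner_config)
    return {**owner_config, **user_config}


# Example values are made up for illustration.
owner_config = {
    "operationsFilteredGeojson": False,
    "operationsRemoveOutliers": True,
    "operationsProcessingRange": 12,  # months
}

inherited = resolve_config(owner_config)
customized = resolve_config(owner_config, {"operationsProcessingRange": 6})
```

In this sketch `inherited` matches the owner's settings exactly, while `customized` changes only the processing range and keeps the other two values.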

Common use cases

Normalize multi-brand data: Pull machine files from John Deere, Climate FieldView, CNHi, Trimble, and Ag Leader into one standard canonical format without writing per-provider parsers.
Build yield maps: Retrieve harvest field operations with per-point yield data and summary statistics, then render them in your application.
Collect planting records: Access planting field operations with seed rate, variety, and population data for compliance reporting or agronomic analysis.
Accept USB uploads: Let growers upload machine files from monitors via the batch API or Magic Link when they don’t use a cloud provider.

What to do next

Uploading Files — Manual upload via the batch API, supported formats, and file preparation by equipment type.
File Conversion — Pipeline stages, status tracking, and what each stage produces.
Field Operations — How files become merged operations and what data is available.
Sample Output — Example responses for files and operations across all operation types.
Units — Unit reference for all numeric properties.
API Reference: Operations — Full endpoint reference for field operations.
API Reference: Files — Full endpoint reference for machine files and batch uploads.
