This page covers what happens after Leaf receives a machine data file, whether from a provider connection or a manual upload. Every file passes through the same conversion pipeline, producing standardized point data and a summary.

Pipeline stages

Each file moves through these stages in order:
  1. originalFile — The raw proprietary file as received. Stored for reference.
  2. rawGeojson — The proprietary format is parsed into Leaf’s raw GeoJSON representation.
  3. standardGeojson — The point data is standardized to Leaf’s public schema and units. This is the primary point-level output returned by the file resource.
  4. filteredGeojson — When filtered output is available for the file, Leaf exposes the filtered point dataset as its own public artifact under this name, not as a cleanup step.
  5. summary — Aggregate statistics (avg, min, max, totals) are calculated from the point data. Includes a geometry representing the spatial coverage.
  6. units — A map of property names to their units for the file.
  7. propertiesPNGs — PNG images generated from numeric properties.
  8. zippedPNGs — A zip bundle containing the generated PNG images.
Each stage runs independently and has its own status.

Tracking file status

Use GET /files/{id}/status to check where a file is in the pipeline. The response is a map keyed by the public step names returned by the API:
{
  "originalFile":     { "status": "processed", "message": "ok" },
  "rawGeojson":       { "status": "processed", "message": "ok" },
  "standardGeojson":  { "status": "processed", "message": "ok" },
  "filteredGeojson":  { "status": "processed", "message": "ok" },
  "propertiesPNGs":   { "status": "processed", "message": "ok" },
  "zippedPNGs":       { "status": "processed", "message": "ok" },
  "summary":          { "status": "processed", "message": "ok" },
  "units":            { "status": "processed", "message": "ok" }
}
Some keys may be absent when a file has not reached that step or when that output is not produced for the file. Possible status values: processed, failed, skipped. If a stage fails, the message field contains details about what went wrong.
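As a minimal sketch of how a client might interpret this response, the helper below groups the step names by status and collects the messages of failed steps. The function names and the "absent" and "other" buckets are illustrative assumptions, not part of the Leaf API. The input is assumed to be the already-parsed JSON body of GET /files/{id}/status.

```python
# Step names as listed on this page, in pipeline order.
PIPELINE_STEPS = [
    "originalFile", "rawGeojson", "standardGeojson", "filteredGeojson",
    "summary", "units", "propertiesPNGs", "zippedPNGs",
]

KNOWN_STATUSES = ("processed", "failed", "skipped")


def summarize_status(status_map: dict) -> dict:
    """Group pipeline steps by the status reported for each one.

    Steps missing from the response go into "absent" (hypothetical bucket);
    any status value not documented on this page goes into "other".
    """
    groups = {"processed": [], "failed": [], "skipped": [], "absent": [], "other": []}
    for step in PIPELINE_STEPS:
        entry = status_map.get(step)
        if entry is None:
            groups["absent"].append(step)
            continue
        status = entry.get("status")
        bucket = status if status in KNOWN_STATUSES else "other"
        groups[bucket].append(step)
    return groups


def failure_messages(status_map: dict) -> dict:
    """Return {step name: message} for every step whose status is "failed"."""
    return {
        step: entry.get("message", "")
        for step, entry in status_map.items()
        if entry.get("status") == "failed"
    }
```

Treating a missing key separately from a failed or skipped step mirrors the note above: an absent key can mean the file has not reached that step yet, so a client would typically check again later rather than report an error.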

File metadata

A converted file (GET /files/{id}) includes:
Field                     Description
id                        Unique file ID
leafUserId                Owner Leaf user
provider                  Source provider (or Leaf for manual uploads)
fileFormat                Original format (e.g., AGDATA, CN1, ISO11783, SHAPEFILE)
fileName                  Original file name
operationType             planted, harvested, applied, or tillage
downloadOriginalFile      Authenticated download URL for the original proprietary file
downloadStandardGeojson   Authenticated download URL for the standardized data
summary                   Embedded summary object with aggregate stats and geometry
fields                    Field IDs this file has been matched to
sourceFiles               If this file was created by merging, the IDs of the source machine files
batchId                   If uploaded via batch API, the batch ID
Always use the download-prefixed URLs (e.g., downloadStandardGeojson) for file downloads. These point to api.withleaf.io and require authentication. Direct S3 URLs are being deprecated.
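A minimal sketch of an authenticated download, using only the Python standard library. It assumes the file metadata from GET /files/{id} has already been parsed into a dict, and that authentication uses a bearer token in the Authorization header. That header scheme and the helper names are assumptions for illustration, so check the authentication documentation for the exact requirement.

```python
import urllib.request


def build_download_request(file_metadata: dict, token: str,
                           key: str = "downloadStandardGeojson") -> urllib.request.Request:
    """Build a request for one of the download-prefixed URLs in the file metadata.

    Assumption: the API accepts "Authorization: Bearer <token>".
    """
    url = file_metadata.get(key)
    if not url:
        raise KeyError(f"file metadata has no {key!r} URL")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def download(file_metadata: dict, token: str,
             key: str = "downloadStandardGeojson") -> bytes:
    """Fetch the file body. Performs a network call."""
    with urllib.request.urlopen(build_download_request(file_metadata, token, key)) as resp:
        return resp.read()
```

Passing key="downloadOriginalFile" would request the original proprietary file instead of the standardized data.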

File summary

The summary (GET /files/{id}/summary) is a GeoJSON Feature with aggregate properties and a geometry representing the spatial coverage of the operation. The properties vary by operation type. Common properties across all types:
Property                  Description
operationType             planted, harvested, applied, or tillage
startTime / endTime       Time range of the operation
totalArea                 Total area covered
totalDistance             Total distance traveled
elevation                 Elevation statistics
speed                     Speed statistics
crop                      Crop type(s)
machinery                 Machine and implement info
originalOperationType     The operation type as reported by the provider
totalFuelUsed             Total fuel consumed (when available)
The summary geometry is built from a buffer of the operation points, creating a polygon that approximates the coverage area.

Data cleanup

When cleanupStandardGeojson is enabled, Leaf removes points that fail these validation rules:
Property                  Valid when
wetMass                   > 0.0
wetMassPerArea            > 0.0
wetVolume                 > 0.0
wetVolumePerArea          > 0.0
harvestMoisture           > 0.0 and < 100.0
appliedRate               > 0.0
seedRate                  > 0.0
tillageDepthActual        >= 0.0
recordingStatus           = "On"
crop                      != "unknown"
products                  >= 0.0
You can customize which rules apply and their thresholds using the cleanupRules configuration.
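To make the default rules concrete, here is a rough sketch of the table above as a point-level predicate. It is an illustration, not Leaf's implementation: the names RULES and passes_cleanup are hypothetical, a rule is assumed to apply only when the property is present and non-null on the point, and the products rule is left out because this page does not describe the shape of that property.

```python
from typing import Any, Callable, Dict

# Default validation rules from the table above (products omitted, see lead-in).
RULES: Dict[str, Callable[[Any], bool]] = {
    "wetMass": lambda v: v > 0.0,
    "wetMassPerArea": lambda v: v > 0.0,
    "wetVolume": lambda v: v > 0.0,
    "wetVolumePerArea": lambda v: v > 0.0,
    "harvestMoisture": lambda v: 0.0 < v < 100.0,
    "appliedRate": lambda v: v > 0.0,
    "seedRate": lambda v: v > 0.0,
    "tillageDepthActual": lambda v: v >= 0.0,
    "recordingStatus": lambda v: v == "On",
    "crop": lambda v: v != "unknown",
}


def passes_cleanup(properties: dict) -> bool:
    """Return True if a point's properties satisfy every applicable rule.

    Assumption: properties that are missing or None are not checked.
    """
    for name, is_valid in RULES.items():
        value = properties.get(name)
        if value is not None and not is_valid(value):
            return False
    return True


def clean_points(points: list) -> list:
    """Keep only the points (dicts of properties) that pass all rules."""
    return [p for p in points if passes_cleanup(p)]
```

Keeping the rules in a dictionary keyed by property name makes it straightforward to drop a rule or swap a threshold, which is the kind of adjustment cleanupRules is described as allowing.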

Filtered GeoJSON and outlier removal

If operationsFilteredGeojson is enabled, Leaf produces an additional filtered version of the data. The filter removes:
  • Points with speed less than 0.5 m/s (all operation types)
For harvest data, outlier removal can also be applied: points whose harvested volume is more than 3 standard deviations from the mean are excluded. The threshold is configurable via operationsOutliersLimit, and outlier removal can be disabled entirely with operationsRemoveOutliers.

At the operation level, the filtered GeoJSON is the basis for generating V2 images, which use a fixed color ramp with 7 quantile-based classes. See Field Operations for details on operation images.
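The speed filter and the outlier rule described in this section can be sketched as follows. This is a simplified illustration, not Leaf's implementation. Several details are assumptions: the property names speed and wetVolume, the use of the population standard deviation, keeping points that lack the relevant value, and the function names. Points are represented as plain dicts of properties.

```python
from statistics import mean, pstdev

MIN_SPEED_M_S = 0.5  # points slower than this are removed


def drop_slow_points(points: list, speed_key: str = "speed") -> list:
    """Remove points with speed below 0.5 m/s; points without a speed are kept."""
    return [
        p for p in points
        if p.get(speed_key) is None or p[speed_key] >= MIN_SPEED_M_S
    ]


def drop_volume_outliers(points: list, volume_key: str = "wetVolume",
                         limit: float = 3.0) -> list:
    """Remove points whose volume is more than `limit` standard deviations from the mean.

    `limit` plays the role of operationsOutliersLimit (default 3).
    """
    values = [p[volume_key] for p in points if p.get(volume_key) is not None]
    if len(values) < 2:
        return list(points)  # not enough data to estimate spread
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(points)  # all values identical, nothing to exclude
    return [
        p for p in points
        if p.get(volume_key) is None or abs(p[volume_key] - mu) <= limit * sigma
    ]
```

Note that with very few points a single extreme value cannot exceed 3 standard deviations, so the outlier step only has an effect on reasonably large datasets.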

Processing timing

Files from provider connections are processed immediately on the first sync, then at least once every 24 hours. Event-driven providers trigger processing sooner. Manually uploaded files begin processing as soon as Leaf receives the upload. Processing time depends on data volume, but initial results are typically available within a few minutes.
Leaf archives files to slower storage after 180 days of no access. Contact support if you need to retrieve archived files or require a different retention period.

What to do next

  • Field Operations — How converted files are merged into field operations.
  • Sample Output — Example file and operation responses.
  • Units — Unit reference for all numeric properties.
Last modified on March 24, 2026