.zip archives containing shapefiles, CSVs, XML reports, or proprietary formats. Leaf identifies the format, extracts the soil data, and returns a flat GeoJSON result plus a canonical JSON result when that output is available.
How it works
The service uses an asynchronous batch model. You upload one or more files, then poll for results.
- Upload. Send one or more .zip files to the batch endpoint along with a Leaf user ID. Each file becomes an entry within the batch.
- Processing. Leaf classifies the file format, extracts soil sample data, normalizes analyte names and units, and produces the GeoJSON output and, where available, the canonical JSON output.
- Retrieval. Poll the batch status endpoint. When an entry reaches COMPLETED, downloadStandardGeojson contains the flat GeoJSON result URL. downloadCanonicalJson contains the hierarchical result URL when that output is available.
A batch has finished processing once its status is COMPLETED, PARTIALLY_COMPLETED, or FAILED.
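The polling step can be sketched as a small loop that stops at any of the three terminal batch statuses. This is a minimal sketch, not Leaf's client: `poll_batch` and `fetch_batch` are hypothetical names, the HTTP call to the batch status endpoint is left to the caller, and the `status` key in the response is an assumption.

```python
import time
from typing import Callable

# Terminal batch statuses as listed in this document.
TERMINAL_STATUSES = {"COMPLETED", "PARTIALLY_COMPLETED", "FAILED"}


def poll_batch(
    fetch_batch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_batch() until the batch reports a terminal status.

    fetch_batch is a caller-supplied function (hypothetical) that requests
    the batch status endpoint and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        batch = fetch_batch()
        # Assumption: the batch status is exposed under a "status" key.
        if batch.get("status") in TERMINAL_STATUSES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError("batch did not finish before the timeout")
        sleep(interval_s)
```

Passing the fetch function and the sleep function as parameters keeps the loop independent of any HTTP library and makes it easy to exercise without network access.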
Key concepts
A batch is a container for one or more uploaded files, created by a single POST request. Its status reflects the aggregate state of all entries.
An entry is one file within a batch. Each entry is processed independently. A batch with three files has three entries, each of which may complete or fail on its own.
Entries move through PROCESSING → COMPLETED or FAILED. Batches follow the same pattern, plus PARTIALLY_COMPLETED when some entries succeed and others fail.
| Status | Level | Meaning |
|---|---|---|
| PROCESSING | Entry / Batch | Upload received, conversion in progress |
| COMPLETED | Entry / Batch | All entries converted successfully |
| PARTIALLY_COMPLETED | Batch only | Some entries completed, some failed |
| FAILED | Entry / Batch | Conversion failed (check errorMessage) |
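The relationship between entry and batch statuses can be illustrated with a short function. This is only a sketch of one reading of the table: `aggregate_status` is a hypothetical helper, and the rule that a batch stays in PROCESSING while any entry is still processing is an assumption, since the service computes the batch status itself.

```python
ENTRY_STATUSES = {"PROCESSING", "COMPLETED", "FAILED"}


def aggregate_status(entry_statuses: list[str]) -> str:
    """Derive a batch status from its entry statuses (illustrative only)."""
    statuses = set(entry_statuses)
    if not statuses:
        raise ValueError("a batch contains at least one entry")
    unknown = statuses - ENTRY_STATUSES
    if unknown:
        raise ValueError(f"unknown entry status: {sorted(unknown)}")
    # Assumption: any entry still running keeps the batch in PROCESSING.
    if "PROCESSING" in statuses:
        return "PROCESSING"
    if statuses == {"COMPLETED"}:
        return "COMPLETED"
    if statuses == {"FAILED"}:
        return "FAILED"
    # A mix of COMPLETED and FAILED entries.
    return "PARTIALLY_COMPLETED"
```

A client normally reads the batch status from the API rather than computing it. The function is mainly useful for reasoning about what a PARTIALLY_COMPLETED batch implies: at least one entry has results to download and at least one has an errorMessage to inspect.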
Output formats
Completed entries return result URLs in the API response. All entries that reach COMPLETED have a downloadStandardGeojson URL. Most supported formats also produce downloadCanonicalJson; for formats that don’t, the field is null.
downloadStandardGeojson is a flat GeoJSON FeatureCollection. Each Feature represents one soil sample at one depth. A sample with multiple depth layers (e.g., surface + subsoil) produces multiple Features sharing the same sampleId. This format works well for mapping, GIS tools, and spatial queries.
When present, downloadCanonicalJson is the full hierarchical data model. It contains an array of SoilSamplingEvent objects preserving the natural tree structure: event → samples → depth layers → analyte results. It includes context not present in the GeoJSON: lab information, provenance (source file, format family, converter version), fertilizer recommendations, and category classification for each analyte result. Use this format when you need the complete data model or when you’re building data pipelines that benefit from structured nesting.
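Because downloadCanonicalJson can be null, code that collects result URLs should treat it as optional. The sketch below assumes an entry is a decoded JSON object with a `status` key (an assumption) alongside the downloadStandardGeojson and downloadCanonicalJson fields described above; `result_urls` is a hypothetical helper name.

```python
def result_urls(entry: dict) -> dict[str, str]:
    """Return the available result URLs for an entry, keyed by output format.

    Entries that have not reached COMPLETED yield an empty dict.
    """
    # Assumption: the entry status is exposed under a "status" key.
    if entry.get("status") != "COMPLETED":
        return {}
    # Every COMPLETED entry is documented to carry the GeoJSON URL.
    urls = {"geojson": entry["downloadStandardGeojson"]}
    # The canonical JSON URL is null for formats that do not produce it.
    canonical = entry.get("downloadCanonicalJson")
    if canonical is not None:
        urls["canonical"] = canonical
    return urls
```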
GeoJSON properties
| Property | Type | Description |
|---|---|---|
| eventId | string | Unique identifier for the sampling event |
| eventDate | string | Sampling date (YYYY-MM-DD), or null |
| eventCode | string | Lab report number or job ID, or null |
| sampleId | string | Unique identifier for this sample |
| sampleNumber | string | Lab sample number (e.g. “1”, “A-1”), or null |
| depthLabel | string | Human-readable depth (e.g. “0-6 in”), or null |
| depthTop | number | Top of sampling depth, or null |
| depthBottom | number | Bottom of sampling depth, or null |
| depthUnit | string | Depth unit (“in” or “cm”), or null |
| growerName | string | Grower name, or null |
| farmName | string | Farm name, or null |
| fieldName | string | Field name, or null |
The field metadata properties (growerName, farmName, fieldName) are omitted entirely when the source has no field metadata.
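Since each Feature is one sample at one depth, reassembling a sample's depth profile means grouping Features by sampleId. The following sketch does that on a decoded FeatureCollection and orders layers by depthTop. `group_by_sample` is a hypothetical helper, and placing layers with a null depthTop last is a choice made here, not documented behavior.

```python
from collections import defaultdict


def group_by_sample(feature_collection: dict) -> dict[str, list[dict]]:
    """Group Feature properties by sampleId, shallowest depth layer first."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for feature in feature_collection.get("features", []):
        props = feature.get("properties") or {}
        groups[props["sampleId"]].append(props)
    for layers in groups.values():
        # depthTop may be null; such layers sort after those with a depth.
        layers.sort(key=lambda p: (p.get("depthTop") is None, p.get("depthTop") or 0))
    return dict(groups)
```

When reading growerName, farmName, or fieldName from the grouped properties, use `props.get(...)` rather than indexing, because those keys can be absent as well as null.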
Analyte properties
Each analyte result adds up to three properties per Feature:
| Pattern | Type | Description |
|---|---|---|
| {analyte} | number | The measured value (e.g. pH, P, K, OM) |
| {analyte}_unit | string | Unit of measurement, present when known |
| {analyte}_method | string | Extraction method, present when known |
For example, a pH reading of 6.4 appears as "pH": 6.4.
Common analytes
| Analyte | Property | Typical units |
|---|---|---|
| pH | pH | (unitless) |
| Organic matter | OM | % |
| Phosphorus | P | ppm |
| Potassium | K | ppm |
| Calcium | Ca | ppm |
| Magnesium | Mg | ppm |
| Cation exchange capacity | CEC | meq/100g |
| Buffer pH | BpH | (unitless) |
| Nitrate-nitrogen | NO3_N | ppm |
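The {analyte}, {analyte}_unit, and {analyte}_method pattern can be unpacked into one record per analyte. The sketch below treats every numeric property that is not one of the documented metadata keys as an analyte value; that rule is an assumption, and `extract_analytes` is a hypothetical helper name.

```python
# Metadata keys from the GeoJSON properties table.
METADATA_KEYS = {
    "eventId", "eventDate", "eventCode", "sampleId", "sampleNumber",
    "depthLabel", "depthTop", "depthBottom", "depthUnit",
    "growerName", "farmName", "fieldName",
}
SUFFIXES = ("_unit", "_method")


def extract_analytes(properties: dict) -> dict[str, dict]:
    """Collect analyte values with their optional unit and method."""
    analytes = {}
    for key, value in properties.items():
        if key in METADATA_KEYS or key.endswith(SUFFIXES):
            continue
        # Assumption: analyte values are numeric; skip anything else.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        analytes[key] = {
            "value": value,
            # Unit and method are present only when known.
            "unit": properties.get(f"{key}_unit"),
            "method": properties.get(f"{key}_method"),
        }
    return analytes
```

For a Feature with `"pH": 6.4` and `"P": 42` plus `"P_unit": "ppm"`, this yields a pH record with no unit and a P record with unit "ppm".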
GeoJSON example
A two-sample FeatureCollection from a shapefile with Mehlich-3 extraction:
Canonical JSON structure
The canonical JSON output is an array of SoilSamplingEvent objects. Each event represents one sampling trip, date, or lab report from the source file.
Each AnalyteResult in the canonical format includes a category field that classifies the measurement:
| Category | Meaning | Examples |
|---|---|---|
| analyte | Direct lab measurement | pH, P, K, Ca, Mg, OM |
| index | Calculated index | P-Index, K-Index |
| derived | Ratio or derived value | BS-Ca, BS-K, SAR |
| sensor | Field instrument reading | EC (Veris), Red, IR |
| passthrough | Unrecognized column preserved as-is | Varies |
The provenance block on every event records exactly which source file and converter produced the output, which is useful for auditing and tracing data lineage.
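Walking the event → samples → depth layers → analyte results tree and filtering by category might look like the sketch below. Only the category field is named in this document; the keys `samples`, `depthLayers`, and `results` are hypothetical placeholders for the real canonical schema, and `iter_results` is a hypothetical helper, so adjust the names to the actual output.

```python
from typing import Iterator, Optional


def iter_results(events: list[dict], category: Optional[str] = None) -> Iterator[dict]:
    """Yield analyte results from canonical events, optionally by category.

    The nesting keys below are assumed names, not the documented schema.
    """
    for event in events:
        for sample in event.get("samples", []):
            for layer in sample.get("depthLayers", []):
                for result in layer.get("results", []):
                    if category is None or result.get("category") == category:
                        yield result
```

Filtering on `category="analyte"` keeps direct lab measurements and drops indexes, derived ratios, sensor readings, and passthrough columns.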
File size limits
| Limit | Value |
|---|---|
| Maximum file size | 50 MB per file |
| Maximum request size | 200 MB per request |
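A client can check these limits before uploading and split a large set of files across several requests. The sketch below packs files greedily in the order given. It assumes 1 MB means 1,000,000 bytes and ignores multipart encoding overhead, neither of which is stated here, and `plan_requests` is a hypothetical helper name.

```python
MB = 1_000_000  # Assumption: decimal megabytes.
MAX_FILE_BYTES = 50 * MB
MAX_REQUEST_BYTES = 200 * MB


def plan_requests(file_sizes: dict[str, int]) -> list[list[str]]:
    """Split files into upload requests that respect both size limits.

    file_sizes maps a file name to its size in bytes.
    """
    requests: list[list[str]] = []
    current: list[str] = []
    current_bytes = 0
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            raise ValueError(f"{name} exceeds the 50 MB per-file limit")
        # Start a new request when this file would overflow the current one.
        if current and current_bytes + size > MAX_REQUEST_BYTES:
            requests.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        requests.append(current)
    return requests
```

Because a batch is created by a single POST request, each inner list corresponds to a separate batch that is polled on its own.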
What to do next
- Supported Formats — Full catalog of accepted soil data formats.
- API Reference: Soil Sampling — Endpoint reference for uploading files, checking status, and retrieving results.

