Soil Sampling Overview - Leaf Documentation

Leaf’s Soil Sampling service accepts soil lab data in over 30 file formats and normalizes it into a standard canonical format. You upload .zip archives containing shapefiles, CSVs, XML reports, or proprietary formats. Leaf identifies the format, extracts the soil data, and returns a flat GeoJSON result plus a canonical JSON result when that output is available.

The Soil Sampling service is currently available by invitation only.

How it works

The service uses an asynchronous batch model. You upload one or more files, then poll for results.

1. Upload. Send one or more .zip files to the batch endpoint along with a Leaf user ID. Each file becomes an entry within the batch.
2. Processing. Leaf classifies the file format, extracts soil sample data, normalizes analyte names and units, and produces both output formats.
3. Retrieval. Poll the batch status endpoint. When an entry reaches COMPLETED, downloadStandardGeojson contains the flat GeoJSON result URL. downloadCanonicalJson contains the hierarchical result URL when that output is available.

Most files complete processing within a few minutes. Poll every 5 seconds for the first minute, then back off to every 15–30 seconds. Stop when the batch status is COMPLETED, PARTIALLY_COMPLETED, or FAILED.
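The polling schedule above can be sketched roughly as follows. This is a minimal illustration, not Leaf client code: `fetch_status` is a hypothetical callable that you would implement to call the batch status endpoint and return the batch status string. The 15-second slow interval and the 30-minute `max_wait` cap are assumptions chosen for the example.

```python
import time
from typing import Callable, Iterator

TERMINAL_STATUSES = {"COMPLETED", "PARTIALLY_COMPLETED", "FAILED"}


def poll_delays(fast_interval: float = 5.0, fast_window: float = 60.0,
                slow_interval: float = 15.0) -> Iterator[float]:
    """Yield short delays during the first minute, then longer ones."""
    elapsed = 0.0
    while True:
        delay = fast_interval if elapsed < fast_window else slow_interval
        yield delay
        elapsed += delay


def wait_for_batch(fetch_status: Callable[[], str],
                   sleep: Callable[[float], None] = time.sleep,
                   max_wait: float = 1800.0) -> str:
    """Poll until the batch reaches a terminal status or max_wait elapses.

    fetch_status is a hypothetical hook: it should query the batch status
    endpoint and return the batch-level status string.
    """
    waited = 0.0
    for delay in poll_delays():
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited + delay > max_wait:
            raise TimeoutError(f"batch still {status} after {waited:.0f}s")
        sleep(delay)
        waited += delay
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.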

Key concepts

A batch is a container for one or more uploaded files, created by a single POST request. Its status reflects the aggregate state of all entries.

An entry is one file within a batch. Each entry is processed independently: a batch with three files has three entries, each of which may complete or fail on its own.

Entries move through PROCESSING → COMPLETED or FAILED. Batches follow the same pattern, plus PARTIALLY_COMPLETED when some entries succeed and others fail.

Status	Level	Meaning
`PROCESSING`	Entry / Batch	Upload received, conversion in progress
`COMPLETED`	Entry / Batch	Conversion succeeded (for a batch, all entries converted successfully)
`PARTIALLY_COMPLETED`	Batch only	Some entries completed, some failed
`FAILED`	Entry / Batch	Conversion failed (check `errorMessage`)
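One way to picture how a batch status relates to its entry statuses is the sketch below. It is an illustration of the table, not Leaf's implementation. Two points are assumptions: that a batch stays PROCESSING while any entry is still processing, and that a batch is FAILED only when every entry failed.

```python
from typing import Iterable


def aggregate_batch_status(entry_statuses: Iterable[str]) -> str:
    """Derive a batch-level status from the statuses of its entries."""
    statuses = list(entry_statuses)
    if not statuses:
        raise ValueError("a batch contains at least one entry")
    # Assumption: any entry still in progress keeps the batch in PROCESSING.
    if "PROCESSING" in statuses:
        return "PROCESSING"
    if all(status == "COMPLETED" for status in statuses):
        return "COMPLETED"
    # Assumption: FAILED at batch level means every entry failed.
    if all(status == "FAILED" for status in statuses):
        return "FAILED"
    return "PARTIALLY_COMPLETED"
```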

Output formats

Completed entries return result URLs in the API response. All entries that reach COMPLETED will have a downloadStandardGeojson URL. Most supported formats also produce downloadCanonicalJson; for formats that don’t, the field is null.

downloadStandardGeojson is a flat GeoJSON FeatureCollection. Each Feature represents one soil sample at one depth. A sample with multiple depth layers (e.g., surface + subsoil) produces multiple Features sharing the same sampleId. This format works well for mapping, GIS tools, and spatial queries.

When present, downloadCanonicalJson is the full hierarchical data model. It contains an array of SoilSamplingEvent objects preserving the natural tree structure: event → samples → depth layers → analyte results. It includes context not present in the GeoJSON: lab information, provenance (source file, format family, converter version), fertilizer recommendations, and category classification for each analyte result. Use this format when you need the complete data model or when you’re building data pipelines that benefit from structured nesting.
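Because the flat GeoJSON emits one Feature per sample per depth, consumers often need to regroup Features by sampleId. The sketch below shows one way to do that on an already-downloaded FeatureCollection parsed into a Python dict. The function name is illustrative, and ordering layers by depthTop with null depths last is an assumption made for the example.

```python
from collections import defaultdict


def group_by_sample(feature_collection: dict) -> dict:
    """Map each sampleId to its list of depth-layer Features."""
    groups = defaultdict(list)
    for feature in feature_collection.get("features", []):
        groups[feature["properties"]["sampleId"]].append(feature)

    def depth_key(feature: dict) -> tuple:
        top = feature["properties"].get("depthTop")
        # Layers without a depth sort after layers that have one.
        return (top is None, top or 0)

    for layers in groups.values():
        layers.sort(key=depth_key)
    return dict(groups)
```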

GeoJSON properties

Property	Type	Description
`eventId`	string	Unique identifier for the sampling event
`eventDate`	string	Sampling date (YYYY-MM-DD), or null
`eventCode`	string	Lab report number or job ID, or null
`sampleId`	string	Unique identifier for this sample
`sampleNumber`	string	Lab sample number (e.g. “1”, “A-1”), or null
`depthLabel`	string	Human-readable depth (e.g. “0-6 in”), or null
`depthTop`	number	Top of sampling depth, or null
`depthBottom`	number	Bottom of sampling depth, or null
`depthUnit`	string	Depth unit (“in” or “cm”), or null
`growerName`	string	Grower name, or null
`farmName`	string	Farm name, or null
`fieldName`	string	Field name, or null

Depth fields are all null when the source data does not specify sampling depth. Field context properties (growerName, farmName, fieldName) are omitted entirely when the source has no field metadata.

Analyte properties

Each analyte result adds up to three properties per Feature:

Pattern	Type	Description
`{analyte}`	number	The measured value (e.g. `pH`, `P`, `K`, `OM`)
`{analyte}_unit`	string	Unit of measurement, present when known
`{analyte}_method`	string	Extraction method, present when known

A Mehlich-3 phosphorus result at 42 ppm produces:

{
  "P": 42.0,
  "P_unit": "ppm",
  "P_method": "MEHLICH_3"
}

A pH value with no known method or unit produces just "pH": 6.4.
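The naming pattern above can be unpacked into one record per analyte. The following sketch assumes that any numeric property outside the documented base properties is an analyte value; that heuristic is an assumption made for this example, not a rule stated by Leaf.

```python
BASE_PROPERTIES = {
    "eventId", "eventDate", "eventCode", "sampleId", "sampleNumber",
    "depthLabel", "depthTop", "depthBottom", "depthUnit",
    "growerName", "farmName", "fieldName",
}


def extract_analytes(properties: dict) -> dict:
    """Collect {analyte: {"value", "unit", "method"}} from Feature properties."""
    analytes = {}
    for key, value in properties.items():
        if key in BASE_PROPERTIES or key.endswith(("_unit", "_method")):
            continue
        # Assumption: analyte values are numeric; skip strings, nulls, booleans.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        analytes[key] = {
            "value": value,
            "unit": properties.get(f"{key}_unit"),      # None when not known
            "method": properties.get(f"{key}_method"),  # None when not known
        }
    return analytes
```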

Common analytes

Analyte	Property	Typical units
pH	`pH`	(unitless)
Organic matter	`OM`	%
Phosphorus	`P`	ppm
Potassium	`K`	ppm
Calcium	`Ca`	ppm
Magnesium	`Mg`	ppm
Cation exchange capacity	`CEC`	meq/100g
Buffer pH	`BpH`	(unitless)
Nitrate-nitrogen	`NO3_N`	ppm

The full set of analytes depends on the input format and lab. Leaf normalizes over 750 column name variations into approximately 70 standard analyte properties.

GeoJSON example

A two-sample FeatureCollection from a shapefile with Mehlich-3 extraction:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [-89.4523, 40.1234]
      },
      "properties": {
        "eventId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "eventDate": "2025-10-15",
        "sampleId": "f1e2d3c4-b5a6-7890-fedc-ba0987654321",
        "sampleNumber": "1",
        "depthId": "d1a2b3c4-e5f6-7890-abcd-111111111111",
        "growerName": "Smith Farms",
        "farmName": "North 40",
        "fieldName": "Section 12",
        "pH": 6.4,
        "OM": 3.2,
        "OM_unit": "%",
        "P": 42.0,
        "P_unit": "ppm",
        "P_method": "MEHLICH_3",
        "K": 185.0,
        "K_unit": "ppm",
        "K_method": "MEHLICH_3",
        "CEC": 14.2,
        "CEC_unit": "meq/100g"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [-89.4531, 40.1242]
      },
      "properties": {
        "eventId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "eventDate": "2025-10-15",
        "sampleId": "f1e2d3c4-b5a6-7890-fedc-ba0987654322",
        "sampleNumber": "2",
        "depthId": "d1a2b3c4-e5f6-7890-abcd-222222222222",
        "growerName": "Smith Farms",
        "farmName": "North 40",
        "fieldName": "Section 12",
        "pH": 6.8,
        "OM": 2.9,
        "OM_unit": "%",
        "P": 38.0,
        "P_unit": "ppm",
        "P_method": "MEHLICH_3",
        "K": 210.0,
        "K_unit": "ppm",
        "K_method": "MEHLICH_3",
        "CEC": 15.1,
        "CEC_unit": "meq/100g"
      }
    }
  ]
}
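For spreadsheet or GIS import, a FeatureCollection like the one above can be flattened to CSV. This is a rough sketch using only the Python standard library. The `longitude` and `latitude` column names are choices made for the example, and only Point geometries are handled.

```python
import csv
import io


def features_to_csv(feature_collection: dict) -> str:
    """Flatten Features to CSV text, one row per Feature."""
    rows = []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        lon = lat = None
        if geometry.get("type") == "Point":
            lon, lat = geometry["coordinates"][:2]
        properties = feature.get("properties") or {}
        rows.append({"longitude": lon, "latitude": lat, **properties})

    # Features may carry different analytes, so use the union of keys
    # in first-seen order as the header.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```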

Canonical JSON structure

The canonical JSON output is an array of SoilSamplingEvent objects. Each event represents one sampling trip, date, or lab report from the source file.

SoilSamplingEvent
├── field_context     (grower, farm, field names and IDs)
├── lab               (lab name, received/processed dates)
├── provenance        (source file, format family, converter version)
├── samples[]
│   ├── geometry      (GeoJSON Point or Polygon)
│   └── depths[]
│       └── results[] (analyte, value, unit, method, category)
└── recommendations[] (fertilizer recommendations, when present)

Each AnalyteResult in the canonical format includes a category field that classifies the measurement:

Category	Meaning	Examples
`analyte`	Direct lab measurement	pH, P, K, Ca, Mg, OM
`index`	Calculated index	P-Index, K-Index
`derived`	Ratio or derived value	BS-Ca, BS-K, SAR
`sensor`	Field instrument reading	EC (Veris), Red, IR
`passthrough`	Unrecognized column preserved as-is	Varies

The provenance block on every event records exactly which source file and converter produced the output, which is useful for auditing and tracing data lineage.
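Walking the canonical tree can be sketched as a nested traversal. The key names `samples`, `depths`, `results`, and `category` are taken from the structure diagram and category table above, but their exact spelling in the real payload is an assumption here; check a downloaded file before relying on them.

```python
from typing import Iterator


def iter_results(events: list) -> Iterator[dict]:
    """Yield every analyte result: event -> samples -> depths -> results."""
    for event in events:
        for sample in event.get("samples", []):
            for depth in sample.get("depths", []):
                for result in depth.get("results", []):
                    yield result


def results_by_category(events: list, category: str) -> list:
    """Return only the results whose category matches (e.g. "analyte")."""
    return [r for r in iter_results(events) if r.get("category") == category]
```

For example, `results_by_category(events, "passthrough")` would list the unrecognized columns that were preserved as-is.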

File size limits

Limit	Value
Maximum file size	50 MB per file
Maximum request size	200 MB per request
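A client can check these limits before uploading. The sketch below treats 1 MB as 1,000,000 bytes, which is the more conservative reading and an assumption, since the page does not say whether MB means 10^6 or 2^20 bytes. It also ignores any multipart encoding overhead in the request.

```python
MB = 1_000_000  # assumption: decimal megabytes (the stricter interpretation)
MAX_FILE_BYTES = 50 * MB
MAX_REQUEST_BYTES = 200 * MB


def check_upload_sizes(file_sizes: dict) -> list:
    """Return a list of problems for a {filename: size_in_bytes} mapping."""
    problems = []
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the 50 MB per-file limit")
    total = sum(file_sizes.values())
    if total > MAX_REQUEST_BYTES:
        problems.append(f"request: {total} bytes exceeds the 200 MB per-request limit")
    return problems
```

The sizes can come from `os.path.getsize` on each .zip file. An empty list means the request fits within both limits under the stated assumption.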

What to do next

Supported Formats — Full catalog of accepted soil data formats.
API Reference: Soil Sampling — Endpoint reference for uploading files, checking status, and retrieving results.
