Build crop type maps for smallholder farms
Step-by-step guide on how to build crop masks using machine learning models
Problem statement
Accurately and efficiently classifying crops grown on smallholder farms in Africa and the Global South is significantly more challenging than in developed nations. This is primarily due to factors such as:
- Data scarcity: Limited availability of high-quality training data, particularly for diverse crop varieties and growing conditions specific to these regions.
- Image variability: Variations in crop appearance due to factors like soil type, climate, and cultivation practices, making it difficult for traditional classification models to generalize.
- Lack of infrastructure: Limited access to advanced technologies and infrastructure, hindering the deployment and maintenance of sophisticated classification systems.
- Economic constraints: Resource limitations and financial constraints often restrict the adoption of advanced agricultural technologies.
Crop circles in Kansas, USA using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER)
Coffee farms in Nyeri, Kenya using Sentinel-2
Rice fields in Vietnam along the Mekong Delta using Sentinel-1
The inability to accurately classify crops in these regions has several negative consequences, including:
- Yield reduction: Misidentification of crops can lead to suboptimal farming practices, affecting crop yields and farmer incomes.
- Pest and disease control: Incorrect crop identification can hinder the effective management of pests and diseases, resulting in crop losses.
- Market access: Accurate crop classification is essential for accessing markets and obtaining fair prices for agricultural products.
- Agricultural policy: Lack of reliable crop data hinders the development of effective agricultural policies and interventions.
The objective is to develop robust and scalable crop classification models that can accurately identify crop varieties on smallholder farms in Africa and the Global South, despite the challenges posed by data scarcity, image variability, and limited infrastructure.
Requirements and output
- Satellite imagery from Sentinel-1 and Sentinel-2 (10 m resolution required for monitoring smallholder farms)
- Georeferenced crop type labels (ground truth)
- Existing land cover maps
- Crop calendar describing the timing of the crop growth cycle
- Crop mask: A geospatial layer (vector or raster) showing the spatial extent of the crop in an area
- Evaluation metrics: A report summarizing the accuracy of the crop mask
- Crop type model: Trained machine learning model for future mapping activities
Step-by-step workflow
Create agroecological zones.
Use soil type, altitude, rainfall, soil moisture, and temperature data to create homogeneous zones that subdivide the broad area of interest into smaller zones based on their agroecology. Agroecological zones are abbreviated as AEZ.
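One way to sketch this zoning step is to cluster grid cells by their environmental variables. The example below uses a small pure-Python k-means as an illustration; the guide does not prescribe a clustering method, and the cell values, the number of zones, and the function names `standardize` and `kmeans` are assumptions made for this sketch. In practice the inputs would be raster layers rather than a handful of tuples.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance so no variable dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=50):
    """Plain k-means. Initial centroids are evenly spaced points from the input
    (a simplification chosen for reproducibility)."""
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels

# Hypothetical grid cells: (altitude in m, annual rainfall in mm, mean temperature in deg C)
cells = [
    (1800, 1200, 17.0), (1750, 1150, 17.5), (1900, 1300, 16.0),  # cool, wet highlands
    (600, 500, 27.0), (650, 450, 28.0), (550, 520, 26.5),        # hot, dry lowlands
]
zones = kmeans(standardize(cells), k=2)
print(zones)  # one zone id per grid cell
```

Standardizing first matters because altitude and rainfall are measured in hundreds or thousands while temperature varies by only a few degrees; without it, distance would be driven almost entirely by altitude.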
Create a stratified sample based on the crop type.
Using historical data, determine which AEZs grow each crop and how the crop is distributed across them. Allocate samples to the AEZs that grow a given crop. It is recommended to use a sampling rate of at least 30%. Within each AEZ, randomly select the farm samples where data will be collected.
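The sampling step can be sketched as follows. This is a minimal illustration that treats each AEZ as a stratum and draws a random 30% of its farms; the farm identifiers, the zone names, and the `stratified_sample` helper are hypothetical, and the interpretation of the 30% figure as a per-zone sampling fraction is an assumption.

```python
import math
import random

def stratified_sample(farms_by_zone, fraction=0.30, seed=42):
    """Randomly select at least `fraction` of the farms within every zone."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    sample = {}
    for zone, farms in farms_by_zone.items():
        # Round up so the sampled share never falls below the requested fraction.
        n = min(len(farms), math.ceil(fraction * len(farms)))
        sample[zone] = rng.sample(farms, n)
    return sample

# Hypothetical farms known (from historical data) to grow maize, grouped by AEZ.
farms_by_zone = {
    "AEZ-1": [f"maize-farm-{i}" for i in range(20)],
    "AEZ-2": [f"maize-farm-{i}" for i in range(20, 30)],
}
selected = stratified_sample(farms_by_zone)
for zone, farms in selected.items():
    print(zone, len(farms), "of", len(farms_by_zone[zone]), "farms selected")
```

Sampling within each zone separately keeps small zones represented, which a single random draw over all farms would not guarantee.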
Create a region-based crop calendar based on the agroecological zones.
Using climatic variables, such as rainfall, crop season, and temperature data, create crop calendars, which contain the planting date, harvesting date, and the time periods for each growth cycle. These crop calendars will be a precursor to crop stage determination and satellite imagery selection for crop type mapping in later steps.
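A crop calendar can be represented as a small data structure that holds the planting date and the duration of each growth stage, from which the harvest date and the stage on any given day follow. The sketch below is one possible representation; the `CropCalendar` class, the stage names, and all dates and durations are illustrative assumptions rather than agronomic values.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class CropCalendar:
    zone: str
    crop: str
    planting: date
    stage_days: dict  # growth stage -> duration in days, in chronological order

    def stage_windows(self):
        """Return {stage: (start, end)} with end dates exclusive."""
        windows, start = {}, self.planting
        for stage, days in self.stage_days.items():
            end = start + timedelta(days=days)
            windows[stage] = (start, end)
            start = end
        return windows

    @property
    def harvest(self):
        """Harvest date is the planting date plus the length of all stages."""
        return self.planting + timedelta(days=sum(self.stage_days.values()))

    def stage_on(self, day):
        """Return the growth stage on `day`, or None outside the season."""
        for stage, (start, end) in self.stage_windows().items():
            if start <= day < end:
                return stage
        return None

# Hypothetical maize calendar for one zone.
calendar = CropCalendar(
    zone="AEZ-1",
    crop="maize",
    planting=date(2023, 3, 15),
    stage_days={"emergence": 14, "vegetative": 45, "flowering": 20,
                "grain filling": 35, "maturity": 20},
)
print(calendar.harvest)
print(calendar.stage_on(date(2023, 5, 1)))
```

The `stage_windows` output is what later steps would use to pick satellite acquisition dates that fall within a particular growth stage.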
Extract band information and vegetation indicators from satellite imagery.
Extract bands and vegetation indicators from Sentinel-1 and Sentinel-2 satellites. Carry out dimensionality reduction to remove redundant bands and indices.
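As an illustration of this step, the sketch below computes NDVI from the Sentinel-2 near-infrared (B8) and red (B4) bands and then drops features that are almost perfectly correlated with one already kept. The guide does not name a dimensionality reduction method, so the correlation filter is a simple stand-in; the reflectance values, the 0.95 threshold, and the helper names are assumptions.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def drop_redundant(features, threshold=0.95):
    """Greedy filter: keep a feature only if its absolute correlation with
    every feature kept so far is below `threshold`."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical surface reflectance for five pixels.
red = [0.05, 0.08, 0.20, 0.25, 0.06]          # Sentinel-2 B4
nir = [0.45, 0.40, 0.25, 0.28, 0.50]          # Sentinel-2 B8
narrow_nir = [0.46, 0.41, 0.26, 0.29, 0.51]   # Sentinel-2 B8A, nearly identical to B8 here

features = {
    "B4": red,
    "B8": nir,
    "B8A": narrow_nir,
    "NDVI": [ndvi(n, r) for n, r in zip(nir, red)],
}
kept = drop_redundant(features)
print(kept)
```

The greedy filter depends on feature order, so listing the bands you most want to retain first is a deliberate design choice in this sketch. Sentinel-1 backscatter features (for example VV and VH) could be added to the same dictionary.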
Create positive and negative labels from the dataset and ground truth data.
Use unsupervised learning (e.g., k-means) on the satellite imagery. Overlay the ground truth data on the unsupervised classes. Determine which classes intersect the ground truth data, and then determine the frequency of intersection. The classes that have few or no intersecting pixels will be labeled as negative pixels (i.e., -ve). The ground truth data will be used as positive labels.
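The overlay logic can be sketched with plain counting. The example assumes each pixel already has a cluster id from the unsupervised step and a flag saying whether it falls inside a ground truth polygon. The 5% cut-off for "few or no" intersecting pixels, the function names, and the use of 0 for pixels left unlabeled are assumptions of this sketch.

```python
from collections import Counter

def negative_clusters(cluster_ids, is_ground_truth, max_fraction=0.05):
    """Return the cluster ids whose share of pixels intersecting ground truth
    is at most `max_fraction`."""
    totals = Counter(cluster_ids)
    overlap = Counter(c for c, gt in zip(cluster_ids, is_ground_truth) if gt)
    return {c for c in totals if overlap[c] / totals[c] <= max_fraction}

def label_pixels(cluster_ids, is_ground_truth, max_fraction=0.05):
    """Label pixels: 1 = positive (ground truth), -1 = negative, 0 = unlabeled."""
    negatives = negative_clusters(cluster_ids, is_ground_truth, max_fraction)
    labels = []
    for cluster, gt in zip(cluster_ids, is_ground_truth):
        if gt:
            labels.append(1)       # ground truth pixels are the positive labels
        elif cluster in negatives:
            labels.append(-1)      # cluster rarely or never overlaps ground truth
        else:
            labels.append(0)       # ambiguous cluster: leave out of training
    return labels

# Hypothetical 12-pixel scene: cluster id per pixel and ground truth overlap flag.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
is_ground_truth = [True, True, False, True,
                   False, False, False, False,
                   False, True, False, False]

labels = label_pixels(cluster_ids, is_ground_truth)
print(labels)
```

Leaving pixels from partially overlapping clusters unlabeled avoids training on pixels that may be unmapped crop fields.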
Train a model and evaluate.
Use random sampling from the negative pixels (i.e., all other land cover classes) and positive pixels (i.e., crop types). Divide the data into training and test datasets. Carry out multi-class classification. Train the crop type model. Carry out model evaluation and report the findings. Apply the trained model to classify datasets from Sentinel-1 and Sentinel-2 using the relevant bands and indices.
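The split, train, and evaluate sequence can be sketched end to end with the standard library. The guide does not specify a model, so a nearest-centroid classifier stands in for the crop type model here; the synthetic NDVI and VH values, the class names, and all function names are assumptions. Real features would normally be standardized first, since VH in decibels dominates the distance in this toy setup.

```python
import math
import random
from collections import Counter, defaultdict

def stratified_split(samples, test_fraction=0.3, seed=0):
    """Split (features, label) pairs per class so each class lands in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append((features, label))
    train, test = [], []
    for items in by_class.values():
        rng.shuffle(items)
        n_test = max(1, round(test_fraction * len(items)))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test

def fit_centroids(train):
    """Stand-in model: the mean feature vector of each class."""
    grouped = defaultdict(list)
    for features, label in train:
        grouped[label].append(features)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}

def predict(centroids, features):
    """Assign the class whose centroid is closest in feature space."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

def evaluate(centroids, test):
    """Return overall accuracy, confusion counts, and per-class metrics."""
    confusion = Counter((label, predict(centroids, f)) for f, label in test)
    accuracy = sum(n for (t, p), n in confusion.items() if t == p) / len(test)
    report = {}
    for cls in centroids:
        tp = confusion[(cls, cls)]
        fp = sum(n for (t, p), n in confusion.items() if p == cls and t != cls)
        fn = sum(n for (t, p), n in confusion.items() if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, confusion, report

# Synthetic pixels: [NDVI, Sentinel-1 VH backscatter in dB]; "other" = negative pixels.
rng = random.Random(1)
centres = {"maize": (0.80, -15.0), "rice": (0.60, -20.0), "other": (0.20, -12.0)}
samples = [([ndvi + rng.uniform(-0.05, 0.05), vh + rng.uniform(-0.5, 0.5)], label)
           for label, (ndvi, vh) in centres.items() for _ in range(20)]

train, test = stratified_split(samples)
model = fit_centroids(train)
accuracy, confusion, report = evaluate(model, test)
print(f"overall accuracy: {accuracy:.2f}")
for cls, metrics in report.items():
    print(cls, {name: round(value, 2) for name, value in metrics.items()})
```

Reporting per-class precision, recall, and F1 alongside overall accuracy matters for crop mapping because rare crops can be misclassified often while overall accuracy remains high.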
Optional: Adapt for other crops and regions of interest.
Use transfer learning to adapt the model to other crops and regions.