Data science tills the fields

Purdue College of Engineering
Purdue Engineering Review
Feb 17, 2022


Nathan Sprague (left), a senior in agricultural and biological engineering (ABE) at Purdue, and John T. Evans IV, assistant professor of ABE, test an autonomous sensing robot in a corn field at Purdue’s Agronomy Center for Research and Education (ACRE). (Photo credit: Purdue University/College of Agriculture/Tom Campbell)

Agriculture today depends on a lot more than water, weather, soil and seed — now you can throw data into the mix. Data science has become increasingly important as we strive to make sustainable, data-driven decisions in modern precision agriculture. Technology enablers like sensor deployment and farm management information systems provide the big data underpinnings that can lead to better decisions, helping to manage the ever-present uncertainties that govern the agricultural sector.

These technology enablers are powerful elements of a digital agriculture solution, but most people don’t realize the true state of data in agriculture today, which perhaps can best be described as “disparate.” As we gather ever-increasing amounts of this data and analyze it, there is a challenge that must be met, and met soon: interoperability — the ability of technology devices and computer systems that are key to digital agriculture to readily exchange their data so it can be processed, analyzed and used.

To a great extent, interoperability determines both what we can do now and what researchers and developers are dreaming of, and the two are quite different. A September 2021 Issue Report from the Farm Foundation highlighted the challenge. “While digital agriculture innovations have enabled data collection at unprecedented scale, the ability to move data between devices and systems remains a sticking point across the industry,” the report said. “As a result, much of the food and agricultural data collected remains in silos, on the farm or separated by organization and industry.”

The resultant data bottlenecks, the report said, “limit innovation and efficiency throughout the supply chain, contribute to an AgTech environment where many digital solutions have unclear value for adoption, and ultimately obscure the full picture of agriculture’s impact and potential.”

New educational courses and approaches that teach data science for agriculture can play a vital role here. For example, Purdue University has a Data-Driven Agriculture minor to help students and practitioners take advantage of advances in sensing, communications, and computation technologies in farming. Agricultural professionals increasingly will be using data on soil, topography and weather to guide more precise decision making. Course topics include data science for agriculture, sensors and process control, remote sensing of land resources, agricultural marketing and price analysis, precision crop management, and global environmental issues.

Matt Rogers, a PhD student in ABE, and Kelly Lewis of Oklahoma State University, a visiting undergraduate researcher, examine a test chamber where the reliability and accuracy of stereo vision cameras for measuring distances were tested in assorted lighting and dust conditions. (Photo credit: Purdue University/College of Agriculture/Tom Campbell)

We’re also running a summer data science program. This past summer, 18 undergraduate students from 13 institutions came to Purdue for a 10-week experiential program focused on digital agriculture. The goal was to bring together students from diverse backgrounds and majors to explore technology applications of data science in agriculture. Participants were pursuing degrees in such areas as physics, mathematics, biology, agricultural and biological engineering, agricultural systems management, and agricultural economics.

Classroom modules introduced the students to data science, and to how coding, software and other tools can be used in agriculture. We covered software like Excel, ArcGIS, Python and R. Each student took on an independent project on themes such as the agricultural data pipeline, data wrangling, and decision making. Students also got to drive a tractor, operate a combine harvester simulator, and participate in other tours and activities. Participating faculty included me; Dharmendra Saraswat, associate professor of Agricultural and Biological Engineering (ABE); James Krogmeier, professor of Electrical and Computer Engineering, with an appointment in ABE; and Mark Ward, professor of Statistics, with an appointment in ABE.

Interestingly, while competencies in coding in such programming languages as R and Python to wrangle data and gain insights certainly are important, agricultural practitioners will no doubt use “no-code” approaches most of the time. Farmers and their advisors and service providers need to know the power of code, but it is impractical to expect them to do much coding while also being versed in the agronomy, animal science, facilities, machinery, and economics knowledge they must apply in their roles.

In some of my circles, we have a saying: “Spreadsheets aren’t going away.” Many view spreadsheets as antiquated and simplistic, merely basic displays of CSV data (CSV stands for comma-separated values: simple text files in which the data is separated by commas). Those folks don’t know the power of spreadsheets. True, they are not good for big data, but data at the firm level generally is complex because of its structure and interoperability, not because of its size. One of my visions for agriculturalists is for them to simply think quantitatively more often. In my opinion, spreadsheets are the best on-ramp for that.
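For readers who do want to see what sits behind a spreadsheet, here is a minimal Python sketch of the same kind of quantitative step a spreadsheet formula would perform on CSV data. The column names, field names, and numbers are invented for illustration and do not come from any real farm record; the function `yield_per_hectare` is likewise a hypothetical helper.

```python
import csv
import io

# Hypothetical firm-level records; every name and value here is made up.
CSV_TEXT = """field,crop,area_ha,yield_t
North 40,corn,16.0,176.0
North 40,soybean,16.0,56.0
Creek Bottom,corn,10.0,95.0
"""


def yield_per_hectare(csv_text):
    """Return {(field, crop): tonnes per hectare} computed from CSV text."""
    result = {}
    # DictReader uses the first line as column names for every later row.
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["field"], row["crop"])
        result[key] = float(row["yield_t"]) / float(row["area_ha"])
    return result


if __name__ == "__main__":
    for (field, crop), value in yield_per_hectare(CSV_TEXT).items():
        print(f"{field:12s} {crop:8s} {value:.1f} t/ha")
```

A spreadsheet does the same division with one formula copied down a column, which is the sense in which it can serve as an on-ramp to this kind of thinking.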

In education, we need to promote more of this quantitative, algorithmic thinking, and an awareness of mathematical possibilities and probabilities. We need to work hard toward interoperability, because we cannot really capitalize on artificial intelligence and machine learning — and their ability to sort through mountains of data, learn from it, and suggest strategies — until we have fuller contextual metadata.

My vision for an optimal future in applying data science to digital agriculture is for autonomous operation — for data to get from where it is to where it needs to go, on its own. The data should be interoperable, both human- and machine-readable. Then educated agriculturalists and data scientists can align the data to work with, refine, calibrate, and parameterize biophysical models — mechanistic, descriptive equations and sets of equations that explain physics, chemistry, physiology, biology, etc.
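As a rough illustration of what calibrating and parameterizing a model can mean, the sketch below fits the single rate constant `k` of a simple exponential drying curve, MR(t) = exp(-k t), often called the Lewis or Newton thin-layer drying model. The choice of model is an assumption made for this example, not one named in the article, and the "observations" are synthetic values generated from a known `k` rather than measured data.

```python
import math


def calibrate_k(times_h, moisture_ratios):
    """Least-squares estimate of k in MR = exp(-k t).

    Taking logs gives -ln(MR) = k * t, a straight line through the origin,
    so k = sum(t * y) / sum(t * t) with y = -ln(MR).
    """
    num = sum(t * (-math.log(mr)) for t, mr in zip(times_h, moisture_ratios))
    den = sum(t * t for t in times_h)
    return num / den


def predict(k, t_h):
    """Moisture ratio predicted by the calibrated equation at time t_h (hours)."""
    return math.exp(-k * t_h)


# Synthetic observations generated from a known k (illustrative, not measured).
TRUE_K = 0.30
times = [1.0, 2.0, 4.0, 8.0]
observed = [math.exp(-TRUE_K * t) for t in times]

k_hat = calibrate_k(times, observed)

if __name__ == "__main__":
    print(f"calibrated k = {k_hat:.3f} per hour")
    print(f"predicted moisture ratio at 6 h = {predict(k_hat, 6.0):.3f}")
```

Once `k` is calibrated, the equation answers questions at times that were never observed, which a table of the four original readings cannot do without interpolation.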

We need to improve this “model” mindset not only in digital agriculture, but across all disciplines, including clinical informatics in healthcare and digital twins in manufacturing. We ought to want to have “the equation for that,” rather than just a table of numbers that we must interpret, with the variability that interpretation entails. Those tables have value, but they should not be the “go-to” and “end-all” in every instance and decision, as data science matures into a more vital and value-added role in all human endeavors.

Dennis R. Buckmaster

Professor of Agricultural and Biological Engineering

Dean’s Fellow for Digital Agriculture

Department of Agricultural and Biological Engineering

College of Engineering

Agricultural Response Systems Co-Lead, NSF Engineering Research Center for the Internet of Things for Precision Agriculture (IoT4Ag)

Faculty Team Member, The Open Ag Technology and Systems Center (OATS)

Purdue University
