An integrated data system for Federal Crop Insurance research

Motivation

The Federal Crop Insurance Program (FCIP) is the most prominent component of the U.S. farm safety net. Expenditures on premium subsidies increased ninefold between 2000 and 2022, by which point they accounted for nearly a third of total budgetary support to producers, and this growth has drawn sustained scrutiny from budget analysts and policymakers even as the program remains central to producers’ risk management (Tsiboe et al., 2025). Evaluating a program of this scale confronts a fundamental data constraint: the primitives that economic theory places at the center of insurance demand and program evaluation (farm-level yield distributions, the menu of contracts a producer faced, and the joint distribution of yields and prices) are not directly observed in any public data source. Producer-level records held by the program, including yields, are explicitly exempt from Freedom of Information Act disclosure under the Federal Crop Insurance Act (7 U.S.C. § 1502(c)), and official yield statistics stop at the county level. The U.S. Department of Agriculture’s Risk Management Agency (RMA) publishes rich administrative records, principally the Summary of Business (experience aggregates) and the Actuarial Data Master (the complete set of rating parameters), but these describe outcomes and rating inputs, not the underlying stochastic environment.

The data system documented in this collection of articles resolves this constraint by exploiting a structural property of the program: premium rates are deterministic, publicly documented functions of a contract’s production history. Because the rating function is known and invertible over the relevant domain, observed premium rates reveal the yield information producers reported to their insurers, a revealed-information argument analogous to recovering marginal cost from observed prices under a known pricing rule (Tsiboe, Turner, & Yu, 2025). Layering RMA’s own stochastic simulation methodology and menu-construction rules on the recovered yields then produces an internally consistent research infrastructure in which observed choices, counterfactual choices, and the stochastic environment are all defined on the same footing.

The four stages

The system comprises four sequential stages, each documented in a companion article and each released as a citable data collection.

Stage 1: Standardized transaction records. Insurance experience is assembled at the level of the aggregated transaction: the set of insured units sharing an insurance pool (county, commodity, type, and practice), a contract design (plan, coverage level, and unit structure), and a crop year. Revenue-based policies are converted to yield-equivalent terms so that all plans are expressed in a common metric, and the rating function is inverted to recover each transaction’s rate yield, the production-history summary statistic that generated its observed premium rate. See Preparing FCIP transactions for yield calibration.
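The inversion idea in Stage 1 can be illustrated with a minimal sketch. It assumes a power-form rating function (a reference rate scaled by the ratio of rate yield to reference yield, raised to an exponent, plus a fixed rate). This form is an illustrative stand-in, not the rating rule as implemented in the companion article, and every function name and parameter value below is hypothetical.

```r
# Hypothetical power-form rating function (an assumption for illustration):
# the premium rate falls as the rate yield rises when exponent < 0.
premium_rate <- function(rate_yield, ref_yield, ref_rate, exponent, fixed_rate) {
  ref_rate * (rate_yield / ref_yield)^exponent + fixed_rate
}

# Closed-form inverse: the rate yield implied by an observed premium rate.
# Only valid where observed_rate > fixed_rate (the invertible domain).
invert_rate <- function(observed_rate, ref_yield, ref_rate, exponent, fixed_rate) {
  stopifnot(all(observed_rate > fixed_rate))
  ref_yield * ((observed_rate - fixed_rate) / ref_rate)^(1 / exponent)
}

# Illustrative parameters, not taken from any actuarial release.
pars <- list(ref_yield = 150, ref_rate = 0.06, exponent = -1.2, fixed_rate = 0.01)

true_yield <- 172
obs_rate   <- do.call(premium_rate, c(list(rate_yield = true_yield), pars))
recovered  <- do.call(invert_rate,  c(list(observed_rate = obs_rate), pars))

all.equal(recovered, true_yield)
```

The round trip only works because the assumed function is strictly monotone in the rate yield; any clamping or capping of rates would restrict the domain over which the inverse is defined.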

Stage 2: Calibrated yields. A constrained minimum-distance procedure selects, for each transaction, the production-history configuration and yield realization that best reconcile the transaction’s premium rate with the rate observed for the same insurance pool in the following year. The resulting calibrated yields are benchmarked against official yield histories and validated in Tsiboe, Turner, and Yu (2025). See Calibrating sub-county yields from insurance transactions.
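A one-dimensional version of the minimum-distance idea in Stage 2 might look like the sketch below. It fixes a single hypothetical production history and searches only over the yield realization, whereas the procedure described above also selects the history configuration. The rating function, the averaging rule for updating the rate yield, and the bounds are all assumptions made for illustration.

```r
# Hypothetical rating function (illustrative stand-in for the documented rule).
premium_rate <- function(rate_yield, ref_yield = 150, ref_rate = 0.06,
                         exponent = -1.2, fixed_rate = 0.01) {
  ref_rate * (rate_yield / ref_yield)^exponent + fixed_rate
}

# Assumed updating rule: next year's rate yield is the simple average of the
# existing history and the newly realized yield.
next_rate_yield <- function(history, realized) mean(c(history, realized))

# Squared distance between the implied and the observed next-year rate.
distance <- function(realized, history, observed_next_rate) {
  (premium_rate(next_rate_yield(history, realized)) - observed_next_rate)^2
}

# Bounded search over the yield realization (the constraint in this sketch).
calibrate_yield <- function(history, observed_next_rate, bounds = c(0, 400)) {
  fit <- optimize(distance, interval = bounds, history = history,
                  observed_next_rate = observed_next_rate, tol = 1e-8)
  fit$minimum
}

history   <- c(148, 160, 155, 171, 139, 166)  # hypothetical production history
true_y    <- 182                              # realization to be recovered
next_rate <- premium_rate(next_rate_yield(history, true_y))

estimate <- calibrate_yield(history, next_rate)
```

Because the assumed rating function is monotone in the realized yield, the squared distance has a single minimum inside the bounds, which is what makes a simple bounded line search adequate here.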

Stage 3: Correlated revenue scenarios. For every calibrated transaction, 500 joint yield–price realizations are generated under RMA’s own simulation methodology (the M-13 framework), preserving the agency’s distributional assumptions and correlation structure. Common random numbers across contract alternatives ensure that counterfactual comparisons isolate contract design from simulation noise. See Correlated yield–price scenarios under the M-13 framework.
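The role of common random numbers can be shown with a small sketch. It is not the M-13 methodology: the correlation, the normal yield and lognormal price marginals, and the simplified yield-protection indemnity rule are all assumptions chosen to keep the example short.

```r
set.seed(20250101)   # hypothetical seed, so the sketch is reproducible
n_draws <- 500
rho     <- -0.35     # assumed yield-price correlation (illustrative)

# Correlated standard normals via the Cholesky factor of the correlation matrix.
corr <- matrix(c(1, rho, rho, 1), nrow = 2)
z    <- matrix(rnorm(n_draws * 2), ncol = 2) %*% chol(corr)

# Assumed marginals: normal yield (floored at zero), lognormal price.
scenarios <- data.frame(
  draw  = seq_len(n_draws),
  yield = pmax(0, 160 + 30 * z[, 1]),
  price = 4.50 * exp(0.18 * z[, 2] - 0.18^2 / 2)
)

# Simplified per-acre yield-protection indemnity (an assumption): the yield
# shortfall below the guarantee, valued at the projected price.
yp_indemnity <- function(coverage, scenarios, approved_yield = 160, proj_price = 4.50) {
  pmax(0, coverage * approved_yield - scenarios$yield) * proj_price
}

# Both alternatives are evaluated on the same draws, so the difference
# reflects the contract terms alone.
ind_70 <- yp_indemnity(0.70, scenarios)
ind_85 <- yp_indemnity(0.85, scenarios)
paired_gain <- mean(ind_85 - ind_70)
```

With shared draws, the higher coverage level pays at least as much in every scenario, so the comparison is never reversed by sampling noise, which would not be guaranteed if each alternative had its own independent draws.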

Stage 4: Choice sets and synthetic panels. The feasible menu of FCIP contracts is reconstructed for each transaction’s year and location and every alternative is evaluated on the common scenarios, enabling revealed-preference and welfare analysis. A parallel branch attributes unit-level indemnities to causes of loss and assembles synthetic pseudo-producer panels locatable on RMA’s rainfall-index grids. See Reconstructing FCIP choice sets and Synthetic producer panels and loss attribution.
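Evaluating every menu alternative on the same scenarios, as Stage 4 describes, can be sketched as follows. The menu here is a hypothetical two-plan grid, the scenarios are stand-in independent draws rather than the Stage 3 output, and the indemnity rules are simplified assumptions that ignore premiums, subsidies, unit structure, and price caps.

```r
set.seed(42)  # hypothetical seed
n_draws <- 500

# Stand-in scenarios; in practice these would be the shared Stage 3 draws.
scenarios <- data.frame(
  yield         = pmax(0, rnorm(n_draws, mean = 160, sd = 30)),
  harvest_price = 4.50 * exp(rnorm(n_draws, mean = -0.18^2 / 2, sd = 0.18))
)

# Hypothetical menu: plan by coverage level.
menu <- expand.grid(
  plan     = c("YP", "RP"),
  coverage = seq(0.50, 0.85, by = 0.05),
  stringsAsFactors = FALSE
)

# Simplified per-acre indemnity rules (assumptions for illustration):
#   YP: yield shortfall valued at the projected price.
#   RP: revenue shortfall, with the guarantee valued at the higher of the
#       projected and harvest prices.
indemnity <- function(plan, coverage, scenarios, aph = 160, proj_price = 4.50) {
  if (plan == "YP") {
    pmax(0, coverage * aph - scenarios$yield) * proj_price
  } else {
    guarantee <- coverage * aph * pmax(proj_price, scenarios$harvest_price)
    pmax(0, guarantee - scenarios$yield * scenarios$harvest_price)
  }
}

# Every alternative is scored on the identical set of draws.
menu$mean_indemnity <- mapply(
  function(plan, coverage) mean(indemnity(plan, coverage, scenarios)),
  menu$plan, menu$coverage, USE.NAMES = FALSE
)
```

A feasibility filter (for example, dropping plan and coverage combinations not offered in a given county and year) would be applied to `menu` before scoring; it is omitted here because the relevant rules are not reproduced in this overview.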

Design principles

Three principles govern the system. First, fidelity to program rules: wherever RMA publishes a procedure (rating, price discovery, simulation, subsidy schedules), that procedure is implemented exactly rather than approximated, so that discrepancies between model and data are informative about behavior rather than artifacts of methodology. Second, common support for counterfactuals: all alternatives, observed or hypothetical, are evaluated under identical stochastic scenarios, the simulation analogue of a within-subject design. Third, reproducibility: every collection is regenerable from public inputs, and all stochastic components carry explicit seed contracts under which any replication can be reproduced in isolation.
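One way to read the seed-contract idea is sketched below: each transaction maps to a fixed seed that does not depend on processing order, so a single transaction can be regenerated without rerunning the rest. The mapping `base_seed + transaction_id` and all names here are hypothetical simplifications, not the contract used by the collections.

```r
# Hypothetical seed contract: a deterministic seed per transaction.
seed_for <- function(transaction_id, base_seed = 1980632L) {
  base_seed + as.integer(transaction_id)
}

# Draws for one transaction depend only on its own seed.
draw_scenarios <- function(transaction_id, n_draws = 500) {
  set.seed(seed_for(transaction_id))
  rnorm(n_draws)
}

# A full run over many transactions ...
full_run <- lapply(1:20, draw_scenarios)

# ... and one transaction regenerated in isolation.
isolated <- draw_scenarios(7)

identical(full_run[[7]], isolated)
```

The design choice is that the random-number state is reset per transaction rather than carried across the loop, which is what makes out-of-order or partial replication give the same draws.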

What the system has enabled

The collections documented here underpin a connected research program. The calibrated yields support sub-county analysis of the full spectrum of yield deviations, not only the indemnified tail, enabling the asymmetric-information analysis of unit-structure elections in Tsiboe, Turner, and Yu (2025). The common-scenario evaluation machinery underlies estimates that a one percent increase in FCIP-induced mean revenue is associated with a 2.25 percent reduction in inter-crop-year revenue variability, driven primarily by individual revenue and yield protection plans (Tsiboe et al., 2025); rankings of Title I and crop insurance program combinations by profit enhancement and risk reduction (Gaku & Tsiboe, 2025); prospective evaluation of a proposed buy-up Price Loss Coverage option, which paired with existing farm-based insurance reduces revenue variability by 23 percent relative to no risk management (Tsiboe & Turner, 2025); and evidence from over one million sub-county observations that low participation in supplemental coverage leaves substantial downside-risk protection untapped (Tsiboe, Biram, & Hagerman, 2026).

Scope and limitations

The unit of analysis throughout is the aggregated transaction, interpreted as a representative producer; farm-level heterogeneity within a transaction is not identified. Coverage extends to the continuously rated individual plans (Yield Protection, Revenue Protection, Revenue Protection with Harvest Price Exclusion, and Actual Production History) from 2011 onward, with area and margin plans entering at the choice-set stage. Administrative records are subject to revision by RMA, and collections are refreshed accordingly; analyses requiring strict replicability should archive the specific vintage used.

Data availability

All collections are distributed as annual files attached to versioned releases and retrievable programmatically; for example:

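# Download the 2022 calibrated-yield file to a temporary directory, then read it.
piggyback::pb_download(
  file = "calibrated_yield_2022.rds", dest = tempdir(),
  repo = "ftsiboe/USFarmSafetyNetLab", tag = "calibrated_yield")
calibrated_yield <- readRDS(file.path(tempdir(), "calibrated_yield_2022.rds"))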

The companion articles state the collection name and host for each data set. Collections hosted on the public repository download without credentials; collections hosted on the private repository (calibrated_revenue, menu_option, calibrated_menu) require an access token, available on request from the author (ftsiboe@hotmail.com) and supplied via Sys.setenv(GITHUB_PAT = "<token provided on request>") before downloading.

Recommended citation

Tsiboe, F. (2026). An integrated data system for Federal Crop Insurance research. In FCIP calibrated and synthetic data catalogue. https://ftsiboe.github.io/rfcipCalibrate/articles/data-overview.html

Data users should additionally cite Tsiboe, Turner, and Yu (2025).

Disclaimer

This product uses data provided by USDA/RMA but is neither endorsed by nor affiliated with USDA or the U.S. Government.

References

Coble, K. H., & Barnett, B. J. (2013). Why do we subsidize crop insurance? American Journal of Agricultural Economics, 95(2), 498–504. https://doi.org/10.1093/ajae/aas093

Gaku, S., & Tsiboe, F. (2025). Evaluation of alternative farm safety net program combination strategies. Agricultural Finance Review, 85(2), 254–273. https://doi.org/10.1108/AFR-11-2023-0150

Glauber, J. W. (2004). Crop insurance reconsidered. American Journal of Agricultural Economics, 86(5), 1179–1195. https://doi.org/10.1111/j.0002-9092.2004.00663.x

Tsiboe, F., & Turner, D. (2025). Incorporating buy-up price loss coverage into the United States farm safety net. Applied Economic Perspectives and Policy, 47. https://doi.org/10.1002/aepp.13536

Tsiboe, F., Turner, D., & Yu, J. (2025). Utilizing large-scale insurance data sets to calibrate sub-county level crop yields. Journal of Risk and Insurance, 92(1), 139–165. https://doi.org/10.1111/jori.12494

Tsiboe, F., Turner, D., Williams, B., Miller, M., Baldwin, K., & Dohlman, E. (2025). Risk reduction impacts of crop insurance in the United States. Applied Economic Perspectives and Policy, 47(5), 1832–1847. https://doi.org/10.1002/aepp.13513

Tsiboe, F., Biram, H., & Hagerman, A. (2026). Low participation and untapped benefits of supplemental crop insurance in the United States (working paper). Agricultural Risk Policy Center, North Dakota State University.

U.S. Department of Agriculture, Risk Management Agency (USDA-RMA). (n.d.). Summary of Business, Actuarial Data Master, and program information. https://www.rma.usda.gov