Tutorial

This tutorial walks through a typical workflow using USDAQuickStats.jl to query the USDA NASS Quick Stats database.

Setup

using USDAQuickStats
set_api_key("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")

Step 1 — Explore available field values

Before building a query it is useful to check what values are available for each field using get_param_values. This avoids guessing field values and getting empty results.

# What sectors are available?
get_param_values("sector_desc")
# ["ANIMALS & PRODUCTS", "CROPS", "DEMOGRAPHICS", "ECONOMICS", "ENVIRONMENTAL"]

# What source types are available?
get_param_values("source_desc")
# ["CENSUS", "SURVEY"]

# Find commodity names (returns a long list)
get_param_values("commodity_desc")

Step 2 — Check record count before querying

The API enforces a hard limit of 50,000 records per query. Requests exceeding this limit will fail. Always use get_counts first to verify your query is within the limit.

# Check how many records our intended query would return
count = get_counts(
    "source_desc=SURVEY",
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019"
)
# 957 — well within the limit, safe to proceed

If a query is too broad:

get_counts("source_desc=SURVEY", "year=2019")
# 448858 — exceeds the limit, needs to be narrowed down

Narrow it down by adding more filters until the count is below 50,000.

Step 3 — Query the database

Use get_nass to fetch the data. Arguments are "field=VALUE" strings. Spaces in values are handled automatically.

data = get_nass(
    "source_desc=SURVEY",
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019",
    "statisticcat_desc=AREA BEARING",
    "statisticcat_desc=PRICE RECEIVED"
)

get_nass returns a Vector{UInt8} — raw bytes that you can parse with any package you prefer.

Step 4 — Parse the results

JSON (default format)

using JSON3, JSONTables, DataFrames

df = DataFrame(jsontable(JSON3.read(data).data))

CSV format

using CSV, DataFrames

data = get_nass(
    "source_desc=SURVEY",
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019";
    format="csv"
)
df = CSV.read(data, DataFrame)

Save to disk

# Save as JSON
write("oranges_ca_2019.json", get_nass(
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019"
))

# Save as CSV
write("oranges_ca_2019.csv", get_nass(
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019";
    format="csv"
))

Using the DataFrames extension

If you have DataFrames, JSON3, JSONTables, and CSV loaded, a dedicated get_nass_df function becomes available:

using DataFrames, JSON3, JSONTables, CSV, USDAQuickStats

df = get_nass_df(
    "source_desc=SURVEY",
    "commodity_desc=ORANGES",
    "state_alpha=CA",
    "year=2019",
    "statisticcat_desc=AREA BEARING",
    "statisticcat_desc=PRICE RECEIVED"
)
# Returns a DataFrame directly — no manual parsing needed

Note that get_nass is unchanged and still returns raw bytes — get_nass_df is a separate function only available when the extension packages are loaded.

Common database fields

FieldDescriptionExample values
source_descData sourceSURVEY, CENSUS
sector_descSectorCROPS, ANIMALS & PRODUCTS
commodity_descCommodityORANGES, CORN, CATTLE
statisticcat_descStatistic categoryAREA BEARING, PRICE RECEIVED
state_alphaState abbreviationCA, TX, FL
yearSurvey year2019, 2020
freq_descFrequencyANNUAL, MONTHLY
agg_level_descAggregation levelSTATE, COUNTY, NATIONAL

For a full list of fields and valid values, use get_param_values or visit the API documentation.