Data User FAQ
Where can I access information about available estimate and survey data?
Information about available datasets can be found in the following:
Variable descriptions and formats can be found in the following files:
Survey Data: Why are there non-integer catch counts?
There are two ways that we end up with non-integer catch counts: 1) grouped catch and 2) incomplete shore mode trips.
In our standard estimation, type A grouped catches require a different sample weight from type B1 or B2 catches, which are always for individual angler-trips. However, we did not want folks to have to worry about two different sample weights when using the public-use datasets. In particular, having two sample weights complicates calculating combined (A+B1) landings. To avoid this situation and use only one sample weight, the claim counts (A) are multiplied by an adjustment for the records with grouped catch.
For shore mode assignments, we allow samplers to intercept incomplete trips under specific conditions. In these cases, anglers are asked to estimate the amount of additional time that they will continue fishing. This additional fishing time is used to expand the catch counts recorded during the interview.
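To see why this produces non-integer counts, here is a minimal sketch. The FAQ does not give the expansion formula, so the proportional scaling below (catch rate held constant over the remaining fishing time) is only an assumption, and the function and parameter names are hypothetical.

```python
def expand_catch(observed_count, hours_fished, additional_hours):
    """Scale an observed catch count up to the expected full-trip count.

    ASSUMPTION: catch accrues at a constant rate, so the count is scaled by
    (total expected hours) / (hours fished so far). The actual survey
    expansion may differ.
    """
    if hours_fished <= 0:
        raise ValueError("hours_fished must be positive")
    total_hours = hours_fished + additional_hours
    return observed_count * total_hours / hours_fished
```

Under this assumption, an angler with 3 fish after 2 hours who expects to fish 1 more hour would be recorded with 4.5 fish.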
Survey Data: What is the difference between PRT_CODE and LEADER?
The PRT_CODE and LEADER codes often have the same values, but they provide different information. The PRT_CODE is the ID_CODE of the party leader. A fishing party includes everyone who fished on the same boat trip. Within a party, there may be multiple groups with separate grouped catches. These groups each have distinct LEADER codes. Headboat trips generally have multiple groups with grouped catch (and therefore multiple LEADER codes) within the same fishing party (PRT_CODE). In the majority of private/rental (PR) and charter (CH) trips, there is only one group in the party, so PRT_CODE will equal LEADER.
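The relationship between the two codes can be checked with a short sketch like the one below. It assumes each record is a dictionary keyed by the PRT_CODE and LEADER field names; the function name and the ID values in the usage note are made up for illustration.

```python
from collections import defaultdict


def leaders_by_party(records):
    """Map each PRT_CODE to the set of distinct LEADER codes within that party."""
    groups = defaultdict(set)
    for record in records:
        groups[record["PRT_CODE"]].add(record["LEADER"])
    return dict(groups)
```

A party whose set holds more than one LEADER code contains several catch groups, as is typical of headboat trips. A party whose only LEADER code equals its PRT_CODE has a single group.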
Survey Data: How is grouped catch recorded?
If A catch is grouped, it will only be reported under the leader's ID_CODE. B1, B2, and all A catch that is not grouped are reported separately by individual ID_CODE. We are working on a modified catch public-use dataset that will eliminate the grouped catch. That will greatly simplify a number of analyses.
All preliminary estimates will likely be revised before being posted as final. The direction and magnitude of such revisions are unpredictable.
The percent standard error, or PSE, is a measure of precision presented with all estimates. Estimates should be viewed with increasing caution as PSEs increase beyond 25.
Large PSEs – those above 50 – indicate high variability around the estimate and therefore low precision. Estimates with large PSEs should be viewed cautiously.
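As a quick illustration, the PSE expresses the standard error as a percentage of the estimate. The sketch below computes it and applies the 25 and 50 thresholds mentioned above. The function names and the wording of the returned notes are illustrative, not part of any MRIP tool.

```python
def percent_standard_error(estimate, standard_error):
    """Return the standard error as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("PSE is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


def precision_note(pse):
    """Map a PSE to the caution levels described in this FAQ (25 and 50)."""
    if pse > 50:
        return "large PSE: low precision, view cautiously"
    if pse > 25:
        return "view with increasing caution"
    return "no specific caution noted"
```

For example, an estimate of 2,000 fish with a standard error of 600 has a PSE of 30, which falls in the "increasing caution" range.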
During the year, we produce preliminary estimates by sampling wave, mode of fishing, and state. These estimates – particularly at lower levels of aggregation – may be imprecise due to small sample sizes. For this reason, MRIP estimates are best viewed in aggregate: annually and at the state or regional level.
4. The Time Series
When comparing catch estimates across an extended time series, note differences in sampling coverage through the years. Some estimates may not be comparable over long time series. For more information about changes in our sampling and coverage, see Program Evolution.
5. Fish Weight Estimates
USE CAUTION WITH WEIGHT DATA
Fish weight estimates are minimums and may not reflect the actual total fish weight landed or harvested.
Fish Weight Estimates Prior to 2004
Weight estimates were calculated by multiplying the estimated number harvested in a cell (year/wave/state/mode/area/species) by the mean weight of the measured fish in that cell. Sometimes we have an estimate of harvest but no mean weight, either because
- The harvest is all reported by the anglers (B1), or
- The interviewers couldn't weigh any fish (fish too big, already gutted and gilled, etc.).
If a cell is missing a mean weight, and if we have at least two fish measured in the state (all fishing areas and modes combined),
- We substitute the mean for the whole state for that wave
- We need two measured fish to get a variance estimate
After state substitution, if the mean weight is still missing,
- We use the mean from the whole subregion for that wave
- The "two fish rule" still applies
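The substitution steps above can be sketched as follows. This is a simplified illustration rather than the production estimation code: the function names are hypothetical, and it assumes that a single measured fish is enough for a cell's own mean, since the text states the two-fish minimum only for the state and subregion substitutes.

```python
def mean_or_none(weights, min_fish):
    """Mean of the measured weights, or None if fewer than min_fish were measured."""
    if len(weights) < min_fish:
        return None
    return sum(weights) / len(weights)


def weight_estimate(harvest_number, cell_weights, state_weights, subregion_weights):
    """Return (estimated weight, source of the mean), or (None, None) if no mean exists.

    state_weights and subregion_weights hold measured fish weights for the same
    wave, with all fishing areas and modes combined.
    """
    candidates = (
        ("cell", mean_or_none(cell_weights, 1)),  # ASSUMPTION: one fish suffices here
        ("state", mean_or_none(state_weights, 2)),  # "two fish rule"
        ("subregion", mean_or_none(subregion_weights, 2)),  # rule still applies
    )
    for source, mean in candidates:
        if mean is not None:
            return harvest_number * mean, source
    return None, None
```

With 100 harvested fish, no measured fish in the cell, and two fish measured in the state at 2.0 and 4.0, the estimate is 100 × 3.0 = 300.0 using the state mean.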
Fish Weight Estimates 2004 to Present
As part of the MRIP re-estimation project, all estimates of landings by weight (lb or kg) were recalculated using the same design-based estimation methodology used to recalculate the estimates of catch in numbers of fish.
During the MRIP re-estimation project, a new method was developed to handle missing weights as well. The new method uses a mix of hot and cold deck imputation as well as length-weight modeling to impute or fill in missing length or weight values by species at the individual angler-trip level.
For individual fish records where lengths are present, missing weights are imputed using length-weight modeling of the form Weight = a*Length^b. In most cases, models are fit by species and two-month wave in the current year. Should a model fail to converge, models are fit by species using the most recent 10 years of data.
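A minimal sketch of this kind of model is shown below. The FAQ does not say how the models are fit. This example assumes ordinary least squares on the log-transformed relationship, log(Weight) = log(a) + b*log(Length), which is a common simplification and may not match the production method. The function names are hypothetical.

```python
import math


def fit_length_weight(lengths, weights):
    """Estimate a and b in Weight = a * Length**b from paired measurements.

    ASSUMPTION: fit by least squares on log(weight) vs. log(length).
    Needs at least two fish with different lengths and positive values.
    """
    xs = [math.log(length) for length in lengths]
    ys = [math.log(weight) for weight in weights]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx  # slope on the log scale is the exponent b
    a = math.exp(y_bar - b * x_bar)  # intercept on the log scale is log(a)
    return a, b


def impute_weight(length, a, b):
    """Predict a missing weight from a measured length."""
    return a * length ** b
```

Once a and b are estimated from fish with both measurements, impute_weight fills in the weight for any fish of the same species that has only a length.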
For intercepted angler-trips with landings but no corresponding length and weight measurements, paired length and weight observations are imputed from complete cases using hot and cold deck imputation. Up to five rounds of imputation are conducted in an attempt to fill in missing values. The rounds begin with imputation cells that correspond to the most detailed MRIP estimation cells but are aggregated to higher levels in subsequent rounds to bring in more length-weight data:
- Round 1: current year, wave, subregion, state, mode, area fished, species
- Round 2: current year, half-year, subregion, state, mode, species
- Round 3: current + most recent prior year, wave, subregion, state, mode, area fished, species
- Round 4: current + most recent prior year, subregion, state, mode, species
- Round 5: current + most recent prior year, subregion, species
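The round-by-round fallback can be sketched like this. It is an illustration of the cell hierarchy only, not the production imputation: the record field names are hypothetical, the donor is drawn at random from the matching pool, and "most recent prior year" is assumed to mean the calendar year before the current one.

```python
import random

# (include prior year?, fields that must match) for Rounds 1 to 5 above.
ROUNDS = [
    (False, ("wave", "sub_reg", "state", "mode", "area", "species")),
    (False, ("half_year", "sub_reg", "state", "mode", "species")),
    (True, ("wave", "sub_reg", "state", "mode", "area", "species")),
    (True, ("sub_reg", "state", "mode", "species")),
    (True, ("sub_reg", "species")),
]


def impute_pair(recipient, donors, current_year, rng=random):
    """Return (length, weight, round_number) from a matching donor, or None.

    donors are complete cases: dictionaries holding the cell fields plus
    "year", "length", and "weight".
    """
    for round_number, (use_prior_year, fields) in enumerate(ROUNDS, start=1):
        # ASSUMPTION: the most recent prior year is current_year - 1.
        years = {current_year, current_year - 1} if use_prior_year else {current_year}
        pool = [
            donor for donor in donors
            if donor["year"] in years
            and all(donor[field] == recipient[field] for field in fields)
        ]
        if pool:
            donor = rng.choice(pool)
            return donor["length"], donor["weight"], round_number
    return None  # still missing after all five rounds
```

Returning the round number alongside the imputed pair makes it easy to see how far the search had to widen before a donor was found.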
For All Years
If fish weights are STILL missing after all the imputation methods have been applied, we give up and leave a missing fish weight estimate. At that point,
- It is up to the user to determine whether to substitute, and
- What substitution is most appropriate to use (a mean from the preceding and following waves, the whole year, same wave over years, whole Atlantic & Gulf coast, or other model-based approaches).
- We don't make those decisions because the information needs and sensitivity of the data vary among species.
The phenomenon of missing fish weights is more widespread with rarely caught species and with large fish (e.g., tunas). The existence and/or extent of missing weights for your query is provided in the column labeled “Landings (no.) without Size Information” in the weight estimates query output. This column provides the number of landed (A+B1) fish that are not included in the weight estimate column (labeled “Harvest (A+B1) Total Weight (lb or kg)”). If the “Landings (no.) without Size Information” column contains a 0 value, then all landed fish are included in the weight estimate.
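If you decide to substitute, one simple approach is sketched below. It is only an illustration: the function and parameter names are hypothetical, and choosing the substitute mean weight (for example, a mean from adjacent waves or the whole year) remains the user's decision.

```python
def weight_with_substitution(reported_weight, landings_without_size, substitute_mean_weight):
    """Add an allowance for landed fish that are excluded from the weight estimate.

    reported_weight: the "Harvest (A+B1) Total Weight" value
    landings_without_size: the "Landings (no.) without Size Information" value
    substitute_mean_weight: a user-chosen mean weight per fish, in the same units
    """
    if landings_without_size == 0:
        return reported_weight  # all landed fish are already included
    return reported_weight + landings_without_size * substitute_mean_weight
```

For example, a reported weight of 1,000 lb with 40 landed fish lacking size information and a substitute mean of 2.5 lb per fish gives 1,100 lb.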
Please review the Glossary for other important tips on using MRIP data.