COPEPODITE
An online plankton time-series analysis and visualization toolkit.

Data Format Instructions:

General Data Layout and Format Concept:

COPEPODITE reads incoming time-series data in a simple, keyword-labeled, comma-delimited ("comma separated values", CSV) format. This format can be generated by Excel and many other database/statistical/spreadsheet programs. The COPEPODITE data format consists of a single row of "column headers" (the first row in the table shown below), followed by multiple rows of date and/or data values.

Format Rule #1:   The very first row must contain the column labels (e.g., DATE-YMD, ABND=, BIOM=, TEMP=, ...).

Format Rule #2:   The first column must be a date column (e.g., DATE-YMD, DATE-MDY, or DATE-DMY).

Format Rule #2a (temporary):   The second column must be either an "ABND" or "BIOM" data type.
This is a temporary rule, active only as long as you see this notice. (The rule exists because the original data system expected plankton values (ABND= abundance, BIOM= biomass) to appear first, rather than temperature (TEMP=), chlorophyll (CHLA=), or something else (LOTH=, OTHR=). This will be fixed in the very near future.)
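Format Rules #1, #2, and #2a can be checked on a file before uploading it. The following Python sketch is only an illustration of those three rules: the function name `check_header` is hypothetical, it is not part of COPEPODITE, and it assumes the keywords are matched exactly as written (uppercase).

```python
import csv
import io

DATE_KEYWORDS = ("DATE-YMD", "DATE-MDY", "DATE-DMY")

def check_header(csv_text):
    """Return a list of problems found in the first (header) row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Rule #1: the very first row must contain the column labels.
    if not rows or not rows[0]:
        return ["file has no header row"]
    header = [cell.strip() for cell in rows[0]]
    problems = []
    # Rule #2: the first column must be a date column.
    if not header[0].startswith(DATE_KEYWORDS):
        problems.append("first column is not a DATE column")
    # Rule #2a (temporary): the second column must be ABND or BIOM.
    if len(header) < 2 or not header[1].startswith(("ABND", "BIOM")):
        problems.append("second column is not ABND or BIOM")
    return problems
```

An empty list means the header passed these three checks; it says nothing about the data rows beneath it.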


Formatting Hints, Tips, and Tricks:

  • It is okay to leave data value cells empty in the spreadsheet (e.g., the blank cells in the table below). These do not need to be filled in with anything. The DATE cells should not be empty, however.
  • If you place a "#" character in the very first column of a row, COPEPODITE will ignore that entire row. (See the "# Comments" and "#2005-11-05" rows below.)
  • If you place a "#" as the first character in a data value cell, COPEPODITE will ignore just that one individual measurement (e.g., the "# lost sample" and "# -100.25 (??)" cells below).
  • The purpose of the "#" ignore option is to let you quickly "remove/restore" anomalous values in a data file without permanently deleting them. If you delete the value instead, it will be harder to restore it in the future. This on/off toggle lets you experiment with large/small values to see how they influence the analysis results.
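The "#" and empty-cell conventions above can be pictured with a short Python sketch. It is an assumption about how such a reader might behave, not COPEPODITE's actual code; the function name `read_rows` is hypothetical.

```python
import csv
import io

def read_rows(csv_text):
    """Return (header, rows) with "#" rows dropped and "#"/empty cells set to None."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = []
    for row in reader:
        # "#" in the first column: ignore the entire row.
        if not row or row[0].strip().startswith("#"):
            continue
        cleaned = []
        for cell in row:
            cell = cell.strip()
            # Empty cell, or "#" as first character: treat as "no measurement".
            cleaned.append(None if cell == "" or cell.startswith("#") else cell)
        rows.append(cleaned)
    return header, rows
```

Removing the "#" from a cell or row in the file restores that value on the next read, which is the on/off toggle described above.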


    ... ... ... Data Format instructions continue after the table below ... ...

    DATE-YMD    | BIOM= Total Wet Weight (mg/m3) | DATE-MDY    | ABND1= Calanus finmarchicus adults (#/m3) | ABND1= Calanus finmarchicus C3-C5 (#/m3) | DATE-DMY   | TEMP= Temperature (C) at 50 meters
    # Comments  | Any comment or text here       |             | Any comment or text here                  | Any comment or text here                 |            | Any comment or text here
    2005-10-27  | 532                            | Oct/26/2005 | 105                                       | 352                                      | 21_10_2005 | 19
    2005-10-28  | 400                            | Oct/27/2005 |                                           | 400                                      | 25_10_2005 | 21
    2005-10-29  |                                | Oct/28/2005 | 217                                       | 183                                      | 26_10_2005 | 20
    2005-11-01  | 319                            | Oct/29/2005 | 50                                        | 117                                      | 03_11_2005 | 17
    #2005-11-05 | 1010                           | Oct/29/2005 | 173                                       | 532                                      | 05_11_2005 | 19
    2005-11-07  | # lost sample                  | Oct/29/2005 | 173                                       | # -100.25 (??)                           | 05_11_2005 | 19
    2005-11-17  | 971                            | Nov/03/2005 | 140                                       | 817                                      | 07_11_2005 | 18

    DATE columns and Date Formats:

    The current version of COPEPODITE uses "monthly means" as its base time unit. This means that if you provide weekly or daily data, it will be binned and averaged into a single monthly value for that month of that year. (Monthly bins will soon be expanded to optional "weekly means", to accommodate the WG137 and WGPME phytoplankton data needs. At that point, you will be given an option to select either monthly or weekly bins in the COPEPODITE processing menu.)

    It is quite common to have different sampling intervals (dates) for the plankton than for the other variables. You might have monthly or seasonal zooplankton data, weekly chlorophyll data, and perhaps daily temperature data. COPEPODITE will read all of these and correctly match up and synchronize the data for you (currently to monthly bins).

    In the example table above, the biomass data and abundance data and temperature data all have different sampling dates (and formats).   The COPEPODITE software will bin them into monthly means and then synchronize them into matching month+year date sets.   (This means the temperature data, although taken on different days, will sync with the corresponding biomass and abundance data for that month!).
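The binning and synchronizing described above can be sketched in a few lines of Python. This is a simplified illustration only: the function names `monthly_means` and `synchronize` are hypothetical, and a plain arithmetic mean of the raw values is assumed, which may differ from what COPEPODITE does internally (for example, if values are transformed before averaging).

```python
from collections import defaultdict
from datetime import date

def monthly_means(samples):
    """Average (date, value) pairs into {(year, month): mean}."""
    bins = defaultdict(list)
    for day, value in samples:
        bins[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in bins.items()}

def synchronize(*series):
    """Line up several monthly series on every month present in any of them."""
    months = sorted(set().union(*(s.keys() for s in series)))
    # A month missing from one series is reported as None for that series.
    return [(m, tuple(s.get(m) for s in series)) for m in months]

biomass = monthly_means([(date(2005, 10, 27), 532.0),
                         (date(2005, 10, 28), 400.0),
                         (date(2005, 11, 1), 319.0)])
temperature = monthly_means([(date(2005, 10, 21), 19.0),
                             (date(2005, 10, 25), 21.0),
                             (date(2005, 11, 3), 17.0)])
print(synchronize(biomass, temperature))
# [((2005, 10), (466.0, 20.0)), ((2005, 11), (319.0, 17.0))]
```

The sample values are a subset of the example table; although the biomass and temperature dates never coincide, both end up keyed by the same (year, month) pairs.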

    Format Rule #3: Each column of data values is assigned to the first date column found to its left. You can have a single date column with 15 data columns following it, you can have 15 paired "date + data" column pairs, or any mixture you desire. The general idea is to make setting up your data file as easy as possible.
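Format Rule #3 amounts to a single left-to-right pass over the header row. The sketch below shows one way to express it in Python; the function name `assign_date_columns` is hypothetical and the code is an illustration, not COPEPODITE's implementation.

```python
def assign_date_columns(header):
    """Map each data column index to the index of the nearest date column on its left."""
    owner = {}
    current_date = None
    for index, label in enumerate(header):
        if label.strip().startswith("DATE-"):
            current_date = index          # a new date column starts here
        elif current_date is not None:
            owner[index] = current_date   # data column belongs to that date column
    return owner

header = ["DATE-YMD", "BIOM= Total Wet Weight (mg/m3)",
          "DATE-MDY", "ABND1= Calanus finmarchicus adults (#/m3)",
          "ABND1= Calanus finmarchicus C3-C5 (#/m3)",
          "DATE-DMY", "TEMP= Temperature (C) at 50 meters"]
print(assign_date_columns(header))
# {1: 0, 3: 2, 4: 2, 6: 5}
```

For the example table, the biomass column uses the DATE-YMD dates, both abundance columns share the DATE-MDY dates, and the temperature column uses the DATE-DMY dates.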

    Format Rule #4: COPEPODITE currently only recognizes three general date formats:

    Keyword  | Order              | Examples
    DATE-YMD | Year + Month + Day | 2010-Mar-15    2010/03/15    2010_03_15    2010.Mar.15
    DATE-DMY | Day + Month + Year | 15-Mar-2010    15/03/2010    15_03_2010    15.Mar.2010
    DATE-MDY | Month + Day + Year | Mar-15-2010    03/15/2010    03_15_2010    Mar.15.2010

    • The delimiter between the three date parts can be any of the following: "-", "_", ".", "/".
    • Three-character (English) months are also recognized (e.g., "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), with capitalization ignored (e.g., "JAN" = "jan" = "Jan" = "JaN" = "jAn").
    • You must provide a month and day for each data value. If you have monthly data with no day, use a day value of "15" (e.g., "15-mar-2010"). If you have annual or "once per year" data, you can use June 15th as the month and day for each year. For "once per season" data, select a month that best represents the season or sampling period.

    Format Rule #5: Your date format must remain consistent within each individual date column (i.e., do not switch from YMD to MDY or DMY within a single date column). You may have multiple date columns with different formats in the spreadsheet (like the table above).
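To make Format Rules #4 and #5 concrete, here is a minimal Python sketch that parses one date cell according to the keyword of its column. The function name `parse_date` is hypothetical, and the handling shown (splitting on the four delimiters, accepting numeric or three-letter English months) is an assumption based on the rules above rather than COPEPODITE's actual parser.

```python
import re
from datetime import date

# Three-letter English month names, matched case-insensitively.
MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Order of the three date parts for each column keyword.
ORDERS = {"DATE-YMD": "ymd", "DATE-DMY": "dmy", "DATE-MDY": "mdy"}

def parse_date(text, keyword):
    """Parse one date cell using the part order given by its column keyword."""
    parts = re.split(r"[-_./]", text.strip())
    if len(parts) != 3:
        raise ValueError(f"expected three date parts in {text!r}")
    fields = dict(zip(ORDERS[keyword], parts))
    month_text = fields["m"].lower()
    month = MONTHS[month_text] if month_text in MONTHS else int(month_text)
    return date(int(fields["y"]), month, int(fields["d"]))

print(parse_date("2010-Mar-15", "DATE-YMD"))  # 2010-03-15
print(parse_date("15/03/2010", "DATE-DMY"))   # 2010-03-15
print(parse_date("Oct/26/2005", "DATE-MDY"))  # 2005-10-26
```

Because the part order comes from the column keyword and never from the cell itself, a value such as "03/12/2005" is unambiguous only as long as the whole column keeps one format, which is the point of Rule #5.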


    DATA (variable) Column Types:

    All COPEPODITE data columns must be labeled with a four-character + "=" variable indicator (e.g., "BIOM=", "ABND=", ...). The text immediately following that indicator keyword can be whatever you desire and will be used to describe the variable in that column (e.g., "BIOM= Total Wet Mass (mg/m3)", "BIOM= Total Dry Mass (mg/m3)"). This "text to the right of the keyword" will be used in all of the plots and figures showing the variable.

    Please make sure that your variable text does not contain any commas (e.g., "Calanus finmarchicus, adults, , female"). These extra commas will corrupt the comma-separated-values data format and cause your data to fail the preview step. Also, at this time please do not include any Greek or mathematical symbols (e.g., the degrees symbol or the "u" ("mu" or micro) symbol), as they will cause the header text to get cut short (the rest of the description text will not be shown in the plots).
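A quick pre-upload check for the two label problems described above might look like the following Python sketch. The function name `description_problems` is hypothetical, and flagging every non-ASCII character is an assumed, deliberately broad stand-in for "Greek or mathematical symbols"; COPEPODITE itself may be more or less strict.

```python
def description_problems(label):
    """List reasons a column label might break the CSV or be cut short in plots."""
    problems = []
    # A comma inside the label splits it into extra CSV columns.
    if "," in label:
        problems.append("contains a comma")
    # Assumption: any character outside plain ASCII is treated as risky.
    bad = sorted({ch for ch in label if ord(ch) > 127})
    if bad:
        problems.append("contains non-ASCII characters: " + " ".join(bad))
    return problems
```

For example, "TEMP= Temperature (C)" passes, while a label that uses the degrees symbol or the Greek letter mu would be reported.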

    Format Rule #6:   COPEPODITE currently only recognizes these Variable Indicators:

    BIOM= | Use this for total [zooplankton] biomass values.
    ABND= | Use this for zooplankton or phytoplankton abundance values.
    CHLA= | Use this for chlorophyll (or pigment) values.
    TEMP= | Use this for temperature values.
    PSAL= | Use this for salinity values.
    LOTH= | Use this for NUTRIENTS and other miscellaneous values. (These values will go through the log10 processing method.)
    OTHR= | Use this for Oxygen(?) and other miscellaneous values. (These values are NOT log10 transformed during processing.)

    Format Rule #7: Your data values must NOT already be "log transformed". Per the WG125/WGZE analysis method, all biological values will be log10 transformed during the processing steps. If your data are already log transformed, and you cannot easily reverse the transformation, you can try using the "OTHR=" header to at least visualize the values, but the results will not be 100% identical to the WG125/WGZE method results.


    Special Grouping Options:

    This option is currently in testing mode.   Listed are the "April 2011" grouping rules, which may change in a later update.
    Your suggestions or comments are welcome.

    Currently the COPEPODITE "group plot" is focused on allowing a user to plot multiple years of taxa (ABND=) or nutrient variables (LOTH=) in a single plot. This is done by identifying the group membership with a one- or two-digit number (e.g., the "ABND1=" column header in the example table earlier in this document indicates "Group 1"). Any taxa data associated with "ABND1=" will be plotted in a separate graph from "ABND2=" (e.g., you could plot "copepod species" using ABND1= and "diatom species" using ABND2=). Currently, different variable types do not cross-group (e.g., "ABND1=" and "BIOM1=" are treated and plotted as completely different groups).
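The group numbering described above can be read straight out of the column labels. The Python sketch below is illustrative only: the names `parse_label` and `group_columns` are hypothetical, and allowing a group number after any of the seven indicators is an assumption drawn from the "ABND1=" and "BIOM1=" examples.

```python
import re
from collections import defaultdict

# Indicator, optional one- or two-digit group number, "=", then free description text.
LABEL = re.compile(r"^(BIOM|ABND|CHLA|TEMP|PSAL|LOTH|OTHR)(\d{1,2})?=\s*(.*)$")

def parse_label(label):
    """Split a column label into (indicator, group number or None, description)."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized column label: {label!r}")
    indicator, group, description = match.groups()
    return indicator, int(group) if group else None, description

def group_columns(header):
    """Collect descriptions of grouped columns under (indicator, group) keys."""
    groups = defaultdict(list)
    for label in header:
        if label.strip().startswith("DATE-"):
            continue  # date columns carry no variable indicator
        indicator, group, description = parse_label(label)
        if group is not None:
            groups[(indicator, group)].append(description)
    return dict(groups)
```

Keying on the (indicator, group) pair keeps "ABND1=" and "BIOM1=" apart, matching the rule that different variable types do not cross-group.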

    Group plots are shown in both "log10" and "raw" value format, because the usefulness of either plot depends on the data, the distribution of values, and what you are looking for within the data. Below are examples of the same data plotted in both formats.






    FAQ Section:

    This section is under construction.   Please email Todd directly with any questions or problems (Todd.OBrien@noaa.gov).




    KNOWN PROBLEMS:

      Date Formatting Related:
      • European vs. United States "Excel" Date Formatting: European versions of Excel appear to default dates to "day-month-year" format. If you open a CSV containing this date format with a United States version of Excel (default date format of month-day-year), the "day-month-year" dates will be partially and incorrectly converted as if they were "month-day-year". This means that any day equal to or less than "12" will be read as a month (e.g., 03/12/2005 is read as March-12-2005, even though the intended date was December-03-2005), and the remaining dates will be left as "junk text". The end result is that the date column will be completely messed up. One possible solution: Open the CSV file with a TEXT editor (e.g., Windows "notepad" or "wordpad") and replace all of the "/" date characters with "_" (e.g., "03/12/2005" becomes "03_12_2005"). This will stop Excel from trying to convert the dates, yet COPEPODITE will still be able to read the date information correctly.
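For large files, the "/" to "_" replacement suggested above can also be scripted instead of done by hand. The following Python sketch is one possible approach, not an official tool: the function name `protect_dates` is hypothetical, and it deliberately changes only cells that look like all-numeric slash-delimited dates, so that a "/" inside a column label such as "(mg/m3)" is left alone.

```python
import csv
import io
import re

# A cell made only of three groups of digits separated by "/".
SLASH_DATE = re.compile(r"^\d{1,4}/\d{1,2}/\d{1,4}$")

def protect_dates(csv_text):
    """Replace "/" with "_" in cells that look like slash-delimited numeric dates."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(
            [cell.replace("/", "_") if SLASH_DATE.match(cell.strip()) else cell
             for cell in row])
    return out.getvalue()

print(protect_dates("DATE-DMY,BIOM= Total Wet Weight (mg/m3)\n03/12/2005,532\n"), end="")
# DATE-DMY,BIOM= Total Wet Weight (mg/m3)
# 03_12_2005,532
```

Dates written with month names (such as "Oct/26/2005") are not matched by this pattern and would need the pattern extended if Excel also alters them.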



