Using DFCGI


Introduction

Program DFCGI is a general purpose data retrieval and formatting tool for accessing data directly from the Bureau of Reclamation Pacific Northwest Region Hydromet and Agrimet system “DAYFILES” database in Boise. The DAYFILES database includes all of the “realtime” data collected from the remote sites, generally observed in 15-minute increments. Executing the program from your browser sends a query directly to the data server, giving you access to the most current data available at the time of your request. The name DFCGI stands for “DAYFILES Common Gateway Interface”.

Do not let the length of this document scare you, as this is a fairly easy program to use. You can learn almost everything you need to know simply by looking at the examples. The more detailed descriptions may be studied for a better understanding of this program and the database. If you read this entire document, you will qualify as a DFCGI Master!

There are two ways to use DFCGI:

  • With a previously created configuration file
  • Interactively, specifying one or more parameters

    If you would like to use a form to generate your own DFCGI query, go to http://www.usbr.gov/pn/hydromet/dfcgi.html


    General Command Syntax

    To execute the dfcgi script, use your internet browser (such as Netscape or Internet Explorer) while connected to the internet and enter the address in the following form:

    http://www.usbr.gov/pn-bin/dfcgi.pl?parameter1=xxx&parameter2=xxx&parameter3=xxx.....

    The host is a Unix server, so the part of the command “pn-bin/dfcgi.pl”, which refers to a directory and file name, is case sensitive and must be in lower case. The rest of the command is not case sensitive. The parameters are described below. At least one parameter is required, and others may be added as needed except when a configuration file is used. Each additional parameter must be preceded by the “&” character.

    Here are two examples:
  • http://www.usbr.gov/pn-bin/dfcgi.pl?conf=boise Uses a predefined configuration file to generate a report for the Boise River basin in southwest Idaho.
  • http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&pcode=q Retrieves the flow of the Payette River near Horseshoe Bend, Idaho. See below for description of the parameters and ways to use additional parameters.
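
    The same addresses can be assembled in a script. The following is a minimal Python sketch, not part of DFCGI itself; the function name build_dfcgi_url is made up for illustration, and only the base address and parameter names come from this guide.

```python
from urllib.parse import urlencode

# Base address as given in this guide; the path is case sensitive (lower case).
BASE_URL = "http://www.usbr.gov/pn-bin/dfcgi.pl"


def build_dfcgi_url(**params):
    """Return a DFCGI address: '?' before the first parameter, '&' before the rest."""
    if not params:
        raise ValueError("at least one parameter is required")
    # urlencode joins name=value pairs with '&' in the order they were given.
    return BASE_URL + "?" + urlencode(params)


print(build_dfcgi_url(conf="boise"))
print(build_dfcgi_url(site="hrsi", pcode="q"))
```

    The two calls print the two example addresses shown above.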


    Using a Configuration File

    A configuration file must first be set up by Reclamation in a specific directory on the Hydromet data server. This is the best way to get your data if you want a report with more than one station, or with some but not all of the available data parameters. A number of reports are already configured, and custom reports to meet your needs can be set up without much trouble. If you need a special report, contact me directly.

    Syntax:
  • http://www.usbr.gov/pn-bin/dfcgi.pl?conf=filename
    or
  • http://www.usbr.gov/pn-bin/dfcgi.pl?cfg=filename

    where “filename” is the name of a previously stored configuration file. If a configuration file is used, no other parameters may be specified in the command. Some of the available configuration files are referenced in the list of “River Basin Reports” at http://www.usbr.gov/pn/hydromet/realtime.html.


    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?conf=boise


    Building Custom Reports

    A customized retrieval includes one required parameter (site or sta), plus one or more optional parameters. The parameters may appear in any order. If a parameter is repeated, the last one specified will be used. If an unrecognized parameter is specified, an error message will appear and the parameter will be ignored; in most cases the retrieval will still complete successfully. The first parameter is preceded by a question mark “?” and each additional parameter must be preceded by an ampersand “&”. There is a limit of 256 bytes in the command, but it is unlikely that your command will ever approach the limit.
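
    These rules can be checked on your own side before a request is sent. The Python sketch below is only an illustration of the rules as stated here, not the server's actual logic. The function name effective_parameters is hypothetical, the set of known names is taken from the parameter summary in this guide, and the sketch assumes the 256-byte limit applies to the whole address. Configuration-file queries (conf or cfg) are outside its scope.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_COMMAND_BYTES = 256  # limit stated in this guide (assumed to cover the whole address)
KNOWN = {"site", "sta", "pcode", "parm", "back", "incr", "time", "last", "end", "form"}


def effective_parameters(url):
    """Return (params, ignored) following the rules described in this guide."""
    if len(url.encode("utf-8")) > MAX_COMMAND_BYTES:
        raise ValueError("command exceeds 256 bytes")
    params, ignored = {}, []
    for name, value in parse_qsl(urlsplit(url).query):
        name = name.lower()       # parameter names are not case sensitive
        if name in KNOWN:
            params[name] = value  # a repeated parameter: the last one wins
        else:
            ignored.append(name)  # unrecognized parameters are ignored
    if "site" not in params and "sta" not in params:
        raise ValueError("site (or sta) is required")
    return params, ignored


params, ignored = effective_parameters(
    "http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&back=12&back=48&foo=1"
)
print(params)   # {'site': 'hrsi', 'back': '48'}
print(ignored)  # ['foo']
```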

    QuickStart Parameter Summary:
    site=cbttname or sta=cbttname        Required, no default
    pcode=datatype or parm=datatype      Optional, default “pcode=all”
    back=hours                           Optional, default “back=24”
    incr=minutes                         Optional, default “incr=15”
    time=hour_offset                     Optional, default “time=0”
    last=yyyymmmdd or end=yyyymmmdd      Optional, default ends at current date and time
    form=format_name                     Optional, default “form=htmlt”

    Example utilizing all parameters, displaying hourly flood discharges from Scoggins Dam for 7 days in early April, 1991 in “SHEF-A” format:
    http://www.usbr.gov/pn-bin/dfcgi.pl?site=scoo&pcode=q&back=168&incr=60&time=-1&last=1991apr09&form=shefa

    Required Parameter:

    site=cbttname or sta=cbttname

    where cbttname is the string which identifies the site in our database. CBTT (Columbia Basin TeleType) names are coordinated among various Federal agencies in the Pacific Northwest. As an example, the cbttname “HRSI” refers to the site “Payette River near Horseshoe Bend, Idaho”. A list of sites which may be found in our database is at http://www.usbr.gov/pn/hydromet/decod_params.html. The Corps of Engineers maintains a complete CBTT list, which includes many sites that do not appear in this database, at http://www.nwd-wc.usace.army.mil/ftppub/cafe/station_info.

    If no other parameters are specified, the report will include 15-minute data for the past 24 hours for all data types available at the site in a web-based table format.

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi

    Optional Parameters:

    pcode=datatype or parm=datatype

    where datatype is the code used in our database for the desired information. A list of datatypes available for each station can be found at http://www.usbr.gov/pn/hydromet/decod_params.html. If the datatype is not specified, all available datatypes will be retrieved. The most commonly used datatypes are:

    q     Stream discharge, cubic feet per second
    gh    Stream “stage” (water level compared to a local datum), feet
    qc    Canal discharge, cubic feet per second
    ch    Canal “stage” (water level compared to a local datum), feet
    fb    Reservoir elevation, feet above mean sea level
    af    Reservoir contents, acre-feet
    ob    Air temperature, degrees F
    pc    Precipitation, cumulative, in inches
    wf    Water temperature, degrees F

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&pcode=q

    back=hours

    where hours is the number of hours to include in the report, looking back from the current time or the specified ending time (see parameter “end”). The default value is 24 hours. The maximum number of time periods that can be retrieved is 1000, so if you are collecting the default 15-minute data values (see parameter “incr”) you could specify “back” as large as 250 hours (four values per hour times 250 hours equals the maximum 1000 time periods). If you specify a combination of “incr” and “back” that would result in more than 1000 time periods, the “back” parameter will be automatically adjusted to limit the number of periods.

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&back=12

    incr=minutes

    where minutes is the number of minutes between data values. The default is 15 minutes. Most Hydromet parameters are observed every 15 minutes, but in some cases you may only wish to see hourly data or perhaps 8-hourly data in your report. Generally, the increment must be an exact multiple of 15 minutes, such as 30, 60, 120, etc. The maximum number of time periods that can be retrieved is 1000, so if you specify “incr=60”, you could specify “back” (see parameter “back”) up to 1000 hours, thereby obtaining over 41 days of hourly data. If you specify a combination of “incr” and “back” that would result in more than 1000 time periods, the “back” parameter will be automatically adjusted to limit the number of periods. Note: If you want just “midnight” values, you should probably be using one of our other data tools to collect data from the ARCHIVES database, instead of this “dfcgi” program.

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&incr=120
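
    The arithmetic behind the 1000-period limit for “back” and “incr” can be written out as a short Python sketch. The function names are made up for illustration, and how the server itself adjusts “back” is not documented here; this shows only the calculation described above.

```python
MAX_PERIODS = 1000  # maximum number of time periods per retrieval, per this guide


def max_back_hours(incr_minutes=15):
    """Largest 'back' value (hours) that stays within 1000 periods."""
    return MAX_PERIODS * incr_minutes // 60


def clamp_back(back_hours, incr_minutes=15):
    """Reduce 'back' if the requested combination exceeds the period limit."""
    return min(back_hours, max_back_hours(incr_minutes))


print(max_back_hours(15))    # 250 hours of 15-minute data
print(max_back_hours(60))    # 1000 hours of hourly data
print(clamp_back(2000, 60))  # 1000
```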

    time=hour_offset

    where hour_offset is the number of hours between the site’s local timezone and the Mountain Timezone (where the data server is located). This parameter is not usually needed, but may be used for sites located in the Pacific Timezone, where the hour_offset would be “-1”. This generally has no effect when the optional parameter “end” is used. Normally (unless “end” is specified), the retrieval includes data up through the current time on the server (Mountain Time). If data are retrieved from a site in the Pacific Timezone, specifying the “time=-1” parameter will result in a table that goes up through the current Pacific Time, one hour earlier. I may modify the program in the future to automatically set this parameter based upon the site, but currently you may set this manually if desired.

    This has no effect upon the times shown for the data values, only upon the ending time of the data table that is retrieved. The time shown for a value is ALWAYS the correct local time, adjusted for standard or daylight saving time, at the actual remote data site. In other words, the time reflects what your watch would say at the location.

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=mfdo&time=-1

    last=yyyymmmdd or end=yyyymmmdd

    where yyyymmmdd is a date string in the strict format shown, such as “2004jan09”. This specifies the last day to be retrieved, up to but not including midnight. Note: “midnight” values are stored in our database as hour “0:00” of the next day, so the last data value usually available for a site on a given day is “23:45”, or 11:45 p.m. If not specified, data will be retrieved up to the current time in the Mountain Timezone, or another time zone if changed with the “time” parameter. You can use the “last” parameter to retrieve historical data, rather than the current data. In the following example, we retrieve 216 hours (9 days) of hourly cumulative precipitation ending at 11 p.m. on April 9, 1991 at Scoggins Dam, near Beaverton, Oregon (this was a flooding event):

    Example: http://www.usbr.gov/pn-bin/dfcgi.pl?site=scoo&pcode=pc&incr=60&back=216&last=1991apr09
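
    If you generate the date string in a script, the Python sketch below produces the yyyymmmdd form. The function name dfcgi_date is hypothetical. This guide shows only “jan” and “apr” as month codes; the sketch assumes the other months use the usual three-letter English abbreviations in lower case.

```python
from datetime import date

# Assumed month codes: lower-case three-letter English abbreviations.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def dfcgi_date(d):
    """Format a date as yyyymmmdd, for example 1991apr09."""
    return f"{d.year:04d}{MONTHS[d.month - 1]}{d.day:02d}"


print(dfcgi_date(date(1991, 4, 9)))  # 1991apr09
print(dfcgi_date(date(2004, 1, 9)))  # 2004jan09
```

    A fixed month table is used instead of strftime so that the result does not depend on the locale settings of your computer.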

    form=format_name

    where format_name is one of the supported output formats described below:

    form=htmlt (this is the default)

    Generates a report which has embedded HTML tags for convenient viewing with your web browser. The actual data values are presented in an HTML “table”.

    form=excel

    Generates a report which has values presented in an HTML “table” without other unnecessary headings. Note – this format can be easily imported into Excel® spreadsheets. See the section in the Frequently Asked Questions, below, for assistance with importing the data.

    form=html

    Generates an ASCII text report using a fixed pitch font, similar to an old fashioned computer printout. Columns of data values are formatted with null (space) values as needed to align them. The results are presented with minimal HTML formatting between a header and trailer, with the actual data displayed as “preformatted” text.

    form=dayf0

    This format emulates the output of one of our legacy data access programs. There is one data value per line, with timestamp. It is presented as fixed-pitch, space-delimited ASCII text as preformatted HTML.

    form=shefa

    This is very similar to format dayf0, but the data lines are formatted differently, using the SHEF (NWS Standard Hydrometeorological Exchange Format) “SHEF-A” format.

    form=shy2k

    This is a SHEF-A format, as above, but with the year presented as a “y2k” compatible 4-digit number.


    Frequently Asked Questions (FAQ)

  • 1. What data sites are available in the database?
  • 2. Who is responsible for the accuracy of the data?
  • 3. How do you set up a configuration file for a special report?
  • 4. How do you get daily average flows and other daily data?
  • 5. Why am I limited to retrieving 1000 time periods at a time?
  • 6. Are you watching my retrievals?
  • 7. How can I get data to go into my Excel spreadsheet automatically?
  • 8. How do I automatically retrieve data on a scheduled basis?
  • 9. What are data “flags” that appear sometimes?
  • 10. What does the value “998877.” mean?
  • 11. Why does the data seem to be a few hours behind at times?
  • 12. Why am I getting an empty report?

    1. What data sites are available in the database?

    The “DAYFILES” database includes all real-time data collected since the beginning of the Hydromet system in the early 1980s. Sites have been added (and deleted) through the years, and the period of record varies by site. Unfortunately, there is not a tool at this time to help identify the period of record for each site, but I’ll be working on that. The database includes sites in the Pacific Northwest that are of interest to the Bureau of Reclamation. Most of them are owned and operated by us, but we also collect and process some data from sites operated by other agencies including the US Geological Survey, the Corps of Engineers, the Idaho Department of Water Resources, the Oregon Water Resources Department, the Idaho Power Company, and probably a few others. A site list is available at http://www.usbr.gov/pn/hydromet/decod_params.html.

    2. Who is responsible for the accuracy of the data?

    You are. Please view our disclaimer: http://www.usbr.gov/pn/hydromet/disclaimer.html. Automated procedures process the data as they arrive and many of the parameters are screened for data values that are out of normal ranges. Those values may be “flagged” as suspect, but we cannot assure that all bad values are identified. Generally, we do not edit the data in the DAYFILES database. The numbers you find here are our “raw data values”. Yours may be the first human eyes to view some of the data! We collect the data for our own purposes and make the data available to you as a courtesy, not because of a mandate. Use the data at your own discretion.

    3. How do you set up a configuration file for a special report?

    You can’t do it yourself; we need to do it for you. It is helpful if you already know which sites, data types, and format you want.

    4. How do you get daily average flows and other daily data?

    Daily average flows and stages, midnight reservoir elevations and storages, and many other daily summary parameters are stored in a separate database called ARCHIVES. You need to use one of the other data retrieval programs such as webhydarcread or yearrpt to retrieve the data from ARCHIVES.

    5. Why am I limited to retrieving 1000 time periods at a time?

    It is impractical to allow access to larger blocks of data because of the time it takes to process the request and the size of the file that needs to be returned to your web browser. It ties up our server and yours, and you may give up and terminate the process before it finishes. Also, it is for our protection against hackers who may wish to overload our server and perform a “denial of service” attack on us. If you really need a full period of record for multiple years, please contact us and request that we send you a disk or CD with the data on it. In some cases, if you have broadband internet, we may be able to email a compressed file to you as an attachment.

    6. Are you watching my retrievals?

    Yes! We log all data requests that come in through this program, including your IP address and the full data query. We do this to better serve you. By viewing the log, I can tell where to place priority when revising programs to be more efficient. We also use the logs for our protection, by detection of attacks against our server by hackers. Users who appear to be abusing our server may be blocked from further access. The information that we collect is only used internally, does not contain personal identifiers, and will only be used for internal management purposes, including security.

    7. How can I get data to go into my Excel® spreadsheet automatically?

    You can use this program to import data directly into your Excel® spreadsheet. Go to the "Data" menu, select "Import External Data" and "New Web Query". In the Address box, enter the full query, as described above, for the data that you wish to import. The default format "htmlt" produces a data table with HTML "TABLE" tags which can be directly imported; however, the time and date strings may not be parsed as a "time" variable in the cells. There is a special format, "form=excel", which produces an Excel®-compatible time string and shows just the output data table without the clutter of extra headings.

    After the Address has been entered, the results will appear. Select the data table by clicking on the yellow arrow, then click "Import". The data will appear in your spreadsheet, and you can generate graphs and make other calculations using the data cells. You can save the spreadsheet with the web query, and click on the red exclamation point in the "External Data" window at any time to refresh the query.

    Excel® is a registered trademark of the Microsoft Corporation.

    8. How do I automatically retrieve data on a scheduled basis?

    Several people run automatic scripts that access our data on a periodic basis to update reports or databases on their own servers. When I figure out how they do it, I'll describe the process here.

    We have the right to block users who we feel are over-working our data server to no benefit. You can avoid this problem by understanding the updating cycle for the sites that you are interested in. Most sites report their data to the database every four hours, so it generally is not required to run an automatic data script more frequently than that. Please be reasonable in the use of this server.
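
    As one possible approach, and not a description of what any particular user actually runs, a script could fetch a report and then wait a full reporting cycle before asking again. The Python sketch below uses only the standard library. The function names are made up for illustration, the four-hour interval is taken from the paragraph above, and the text encoding of the reply is an assumption.

```python
import time
import urllib.request

FETCH_INTERVAL_SECONDS = 4 * 60 * 60  # most sites report every four hours


def seconds_until_next_fetch(last_fetch, now, interval=FETCH_INTERVAL_SECONDS):
    """Seconds to wait before another request is reasonable (0 if already due)."""
    return max(0.0, last_fetch + interval - now)


def fetch_report(url, timeout=30):
    """Download one report and return it as text (encoding is assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def poll(url, handle, cycles):
    """Fetch url a fixed number of times, waiting one interval between requests."""
    last = None
    for _ in range(cycles):
        if last is not None:
            time.sleep(seconds_until_next_fetch(last, time.time()))
        handle(fetch_report(url))  # pass the report text to your own code
        last = time.time()
```

    For example, poll("http://www.usbr.gov/pn-bin/dfcgi.pl?site=hrsi&pcode=q", print, 6) would print six reports over roughly 20 hours. A system scheduler that runs a one-shot script every four hours would serve the same purpose.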

    9. What are data “flags” that appear sometimes?

    As the incoming data are processed in our software, certain quality tests are performed in an attempt to catch values that are obviously in error or in some other way questionable. If such a value is detected, it is marked with a data "flag" in our database. Many of the reports that are generated with the data (but not all) will carry the flag along. It is in the form of a single character, usually appearing very near the data value. If you see a flag, it does not necessarily mean that the value is bad (or good), just that it should be examined more carefully for reasonableness. Not all bad values will be flagged. Remember that the user is responsible for all uses of this data (see our disclaimer).

    The most common data flags are:

    10. What does the value “998877.” mean?

    Sometimes you may see the value 998877. in your report. That is our “null value”, and indicates missing data. Often, the missing values are replaced with a dash, depending upon which output format you select, but sometimes you may see the 998877. You should ignore those values, and treat them as if the data does not exist.
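
    If you process the values in a script, you can screen out the null value as you read them. The Python sketch below is one way to do it; the function name clean_value is hypothetical, and it assumes each value arrives as plain text with no data flag attached.

```python
NULL_VALUE = 998877.0  # the "null value" described above


def clean_value(text):
    """Return the value as a float, or None for missing data."""
    text = text.strip()
    if text in ("", "-"):  # some formats show a dash for missing data
        return None
    value = float(text)
    return None if value == NULL_VALUE else value


print(clean_value("998877."))  # None
print(clean_value("-"))        # None
print(clean_value(" 1250.5 ")) # 1250.5
```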

    11. Why does the data seem to be a few hours behind at times?

    At the remote sites, the sensors collect data every 15 minutes (in most cases) and store the data in a local log file at the site. On a scheduled basis, every 4 hours or every hour depending upon the site, the remote logger transmits the information to the GOES satellite. The data are received at the Bureau of Reclamation Direct Readout Ground Station (DRGS) in Boise, processed immediately, and stored in the database. Therefore, there may be a delay of up to 4 hours between the time that the data are collected and the time they become available on our server. How “fresh” the data are when you access them depends upon when the last scheduled satellite transmission occurred for the site.

    12. Why am I getting an empty report?

    The most likely cause of an empty report is an invalid parameter in your request, such as a site “cbttname” or data type that does not exist. Check your input and try again. It is also possible that a malfunction at a remote site has caused gaps in the database.