Data structure and format
All monitoring results shared on DataStream are reported in a common format, making the data easier to interpret, integrate and use.
Data on DataStream are uploaded, stored and shared in DataStream’s open data schema (DS-WQX), which specifies required fields and allowed values for consistent vocabulary.
This data format is based on the Water Quality Exchange (WQX) standard, which was developed by the US Environmental Protection Agency (EPA) and the US Geological Survey (USGS) and is one of the most widely adopted standards for water data in North America.
The standard enables diverse monitoring entities to share results in a common format, and prevents missing or ambiguous information that can reduce data quality and impede data use.
Adhering to this common standard makes data easier to interpret, integrate and use – even when collected across different sectors and jurisdictions.
Getting oriented with DataStream's data format
What is a data schema?
A data schema defines how data are organized and reported. It specifies the required fields that describe a given data point (i.e. a given monitoring result or observation), as well as the controlled vocabulary (or “allowed values”) for how information in those fields is reported.
- The Allowed Values tab outlines the data format used by DataStream, including which columns (or fields) are required, optional or conditional, and how the values in each column should be entered (e.g. number, free text, allowed value list). This tab also includes a list of the “allowed values” for each column.
- The Glossary tab describes all of the column names and provides definitions for many of the allowed value terms.
- The CharacteristicName LOOKUP tab provides a list of all the water quality parameters (characteristics) accepted on DataStream and how they should be entered. You can also see which parameters require additional information to be entered (like sample fraction and method speciation).
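To make the idea of required fields, allowed values and conditional fields more concrete, here is a minimal Python sketch of how a schema check could work. It is an illustration only: the field names, allowed values and the conditional rule below are assumptions chosen for the example, not the actual DS-WQX definitions, which are listed in the tabs described above.

```python
# Hypothetical mini-schema. Field names and allowed values are illustrative
# assumptions, not the actual DS-WQX definitions.
SCHEMA = {
    "CharacteristicName": {
        "required": True,
        "allowed": {"pH", "Temperature, water", "Calcium"},
    },
    "ResultValue": {"required": True, "numeric": True},
    "ResultUnit": {"required": True, "allowed": {"None", "deg C", "mg/L"}},
    "ResultSampleFraction": {"required": False, "allowed": {"Dissolved", "Total"}},
}

# Hypothetical conditional rule: these characteristics also need a sample fraction.
NEEDS_SAMPLE_FRACTION = {"Calcium"}


def clean(row, field):
    """Return the field as a stripped string, or '' if it is absent."""
    raw = row.get(field)
    return "" if raw is None else str(raw).strip()


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, rule in SCHEMA.items():
        value = clean(row, field)
        if not value:
            if rule.get("required"):
                problems.append(f"{field}: required field is empty")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: '{value}' is not an allowed value")
        if rule.get("numeric"):
            try:
                float(value)
            except ValueError:
                problems.append(f"{field}: '{value}' is not a number")
    # Conditional field: only required for certain characteristics.
    if (clean(row, "CharacteristicName") in NEEDS_SAMPLE_FRACTION
            and not clean(row, "ResultSampleFraction")):
        problems.append("ResultSampleFraction: required for this characteristic")
    return problems


# Example: a misspelled characteristic, a non-numeric value and a missing unit.
for problem in validate_row(
    {"CharacteristicName": "ph", "ResultValue": "seven", "ResultUnit": ""}
):
    print(problem)
```

A check like this shows why controlled vocabulary matters: "ph" and "pH" would otherwise be treated as two different parameters when datasets are combined.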
Adding new characteristics
DataStream’s data schema continually evolves to meet the needs of the water community. If a water quality characteristic (parameter) that you measure is not included, please contact us about adding it to the Allowed Values list. If the characteristic is not included in the WQX standard, we will submit the request to the US EPA as well, so our standards remain aligned.
Formatting your data for upload
The worksheets in DataStream’s Upload Template can be used to prepare your data for upload (learn more in the Data upload resources section). If you have a large volume of data to convert into DataStream’s data format, it may be more practical to use tools like R or Python to speed up the process.
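As a rough idea of what a scripted conversion might look like, the Python sketch below renames the columns of an in-house export and translates local parameter labels. Everything specific in it is an assumption made for illustration: the source column names, the target column names and the label translations are hypothetical, so check them against the Upload Template before adapting it.

```python
import csv
import io

# Hypothetical mapping from an in-house export to DataStream-style column names.
COLUMN_MAP = {
    "site": "MonitoringLocationID",
    "date": "ActivityStartDate",
    "parameter": "CharacteristicName",
    "value": "ResultValue",
    "unit": "ResultUnit",
}

# Hypothetical translation of local parameter labels into allowed values.
CHARACTERISTIC_MAP = {
    "Water temp": "Temperature, water",
    "DO": "Dissolved oxygen (DO)",
}


def convert(source, target):
    """Read rows from `source` (CSV text stream) and write renamed rows to `target`."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        target, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Rename each column and trim stray whitespace from the values.
        out = {new: row[old].strip() for old, new in COLUMN_MAP.items()}
        # Replace local labels with the controlled vocabulary where known.
        name = out["CharacteristicName"]
        out["CharacteristicName"] = CHARACTERISTIC_MAP.get(name, name)
        writer.writerow(out)


# Small in-memory example; real files would be opened with open(path, newline="").
raw = (
    "site,date,parameter,value,unit\n"
    "Lake-01,2023-06-01,Water temp,18.4,deg C\n"
    "Lake-01,2023-06-01,pH,7.2,None\n"
)
converted = io.StringIO()
convert(io.StringIO(raw), converted)
print(converted.getvalue())
```

Labels that are not in the translation table pass through unchanged, so it is worth reviewing the output against the allowed values before uploading.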
We recommend setting up a call with a Data Specialist on our team who can provide assistance and help determine the best approach. Contact us: team@DataStream.org
Have questions, requests, or need help with your data?