  • Tyler

Research Update

Updated: Sep 26, 2019

I've spent the past several days getting relatively simple machine learning models running on the stream flow data. We've processed data from several sources, including stream flow, precipitation, and snow data, but I've decided to focus my initial efforts on the stream flow data to make sure I understand it before complicating the problem with additional data. One key assumption I wanted to validate was that there are significant relationships between stream flow stations on the same stream. For example, if one station measures a significant increase in stream flow, then a station further downstream is likely to measure an increase a few hours later. If this is the case, then we can improve predictions of stream flow at a particular station by incorporating information from the stations upstream of it.
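One simple way to probe this assumption is a lagged correlation: shift the downstream series back by a candidate delay and see which delay lines it up best with the upstream series. The sketch below is only an illustration on synthetic data, not the models or data from this project. The function names, the 2-hour travel time, and the noise levels are all assumptions made up for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(upstream, downstream, lag):
    """Correlate upstream[t] with downstream[t + lag] (lag in time steps)."""
    if lag == 0:
        return pearson(upstream, downstream)
    return pearson(upstream[:-lag], downstream[lag:])


def best_lag(upstream, downstream, max_lag):
    """Return the lag with the highest correlation, plus all scores."""
    scores = {
        lag: lagged_correlation(upstream, downstream, lag)
        for lag in range(max_lag + 1)
    }
    return max(scores, key=scores.get), scores


# Synthetic data (assumption): 15-minute readings, and water takes
# 8 steps (2 hours) to travel from the upstream to the downstream station.
random.seed(0)
TRUE_LAG = 8
upstream = [0.0]
for _ in range(499):
    # Autocorrelated series standing in for stream flow.
    upstream.append(0.9 * upstream[-1] + random.gauss(0, 1))

downstream = []
for t in range(len(upstream)):
    source = upstream[t - TRUE_LAG] if t >= TRUE_LAG else upstream[0]
    downstream.append(source + random.gauss(0, 0.05))

lag, scores = best_lag(upstream, downstream, max_lag=24)
print(f"best lag: {lag} steps ({lag * 15} minutes)")
```

If real station pairs show no clear peak at any plausible lag, that would suggest the expected upstream-to-downstream relationship is weak in the data, or is being hidden by an earlier processing step.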


However, I've now implemented several simple models, and so far it looks like predictions are only slightly improved by incorporating stream flow measurements from multiple stations rather than one. This pretty strongly violates my expectations, so at this point I'm spending time digging into the data to make sure that the sorts of relationships I expect actually exist in the data set. So far the results of that investigation are mixed. I think there is a good chance that the lack of performance gain from incorporating data from multiple stream flow stations is an artifact of the pre-processing or of details of the problem definition. For example, stations measure stream flow at 15-minute intervals, but I'm aggregating measurements into non-overlapping 3-hour windows, which may be too coarse a temporal resolution. There are several other small decisions like this that could have a significant effect on the relationship between the stations that measure stream flow.
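To make the resolution concern concrete, here is a small sketch of how non-overlapping 3-hour windows can hide a 2-hour delay between stations. It assumes the aggregation is a simple mean over each window, which is a guess for illustration. The flow values and the `aggregate` helper are hypothetical as well.

```python
def aggregate(values, window):
    """Mean over consecutive non-overlapping windows.

    A trailing partial window is dropped.
    """
    n_full = len(values) // window
    return [
        sum(values[i * window:(i + 1) * window]) / window
        for i in range(n_full)
    ]


READINGS_PER_WINDOW = 12  # 3 hours of 15-minute readings

# Six hours of made-up readings with a base flow of 10.0.
upstream = [10.0] * 24
downstream = [10.0] * 24

# A one-hour pulse passes the upstream station at the start of the record
# and reaches the downstream station 8 readings (2 hours) later.
for t in range(0, 4):
    upstream[t] = 50.0
for t in range(8, 12):
    downstream[t] = 50.0

upstream_3h = aggregate(upstream, READINGS_PER_WINDOW)
downstream_3h = aggregate(downstream, READINGS_PER_WINDOW)

print(upstream_3h)
print(downstream_3h)
```

Both pulses fall inside the first 3-hour window, so the two aggregated series come out identical and the 2-hour delay is no longer visible. At that resolution the upstream station carries no advance information about the downstream one, which is one way the aggregation step alone could shrink the benefit of adding more stations.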


A major component of the project as I've currently formulated it is modelling relationships between the locations where data is collected in order to improve predictions. If it turns out that relationships between stream flow stations don't improve predictive performance, that would undercut this component and force me to rethink how the project is formulated.
