Rich Signell's Notebook Blog

Blogging with IPython Notebooks

Extract Water Level Time Series from NECOFS forecast model

Demonstration using the NetCDF4-Python library to extract time-series of sea surface height from a triangular grid ocean model (FVCOM) via OPeNDAP, specifying the desired URL, time range, and lat/lon locations of interest.

NECOFS (Northeastern Coastal Ocean Forecast System) is run by groups at the University of Massachusetts Dartmouth and the Woods Hole Oceanographic Institution, led by Drs. C. Chen, R. C. Beardsley, G. Cowles and B. Rothschild. Funding is provided to run the model by the NOAA-led Integrated Ocean Observing System and the State of Massachusetts.

NECOFS is a coupled numerical model that uses nested weather models, a coastal ocean circulation model, and a wave model. The ocean model is a volume-mesh model with horizontal resolution that is finer in complicated regions. It is layered (not depth-averaged) and includes the effects of tides, winds, and varying water densities caused by temperature and salinity changes.

  • Model description: http://fvcom.smast.umassd.edu/research_projects/NECOFS/model_system.html
  • THREDDS server with other forecast and archive products: http://www.smast.umassd.edu:8080/thredds/catalog.html
In [2]:
# Plot forecast water levels from NECOFS model from list of lon,lat locations
# (uses the nearest point, no interpolation)
from pylab import *
import netCDF4
import datetime as dt
import pandas as pd
from StringIO import StringIO

In the cells below, specify a NECOFS dataset to extract data from. Massbay is most detailed for Massachusetts waters, GOM3 extends over entire Gulf of Maine (and more). There are forecasts runs, archived forecasts runs, and hindcast simulations available. Move the cell you are interested in to the bottom of the list, so it's executed last (thus overwriting the other URLs). Poor man coding!

In [3]:
# FVCOM GOM3v11 NECOFS Forecast Archive (Apr 01,2010-Nov 10,2011)
model='GOM3'
url='http://www.smast.umassd.edu:8080/thredds/dodsC/fvcom/archives/necofs_gom3v11'
In [4]:
# FVCOM GOM3v13 NECOFS Forecast Archive (May 09,2013-Jul 31,2013)
model='GOM3'
url='http://www.smast.umassd.edu:8080/thredds/dodsC/fvcom/archives/necofs_gom3v13'
In [5]:
# FVCOM GOM3v12 NECOFS Forecast Archive (Nov 11,2011-May 08,2013)
model='GOM3'
url='http://www.smast.umassd.edu:8080/thredds/dodsC/fvcom/archives/necofs_gom3v12'
In [6]:
#NECOFS GOM3 grid forecast
model='GOM3'
url='http://www.smast.umassd.edu:8080/thredds/dodsC/FVCOM/NECOFS/Forecasts/NECOFS_GOM3_FORECAST.nc'
In [7]:
#NECOFS MassBay grid forecast
model='Massbay'
url='http://www.smast.umassd.edu:8080/thredds/dodsC/FVCOM/NECOFS/Forecasts/NECOFS_FVCOM_OCEAN_MASSBAY_FORECAST.nc'

Specify a list of stations where you want forecast data. Just move the cell with the x= you want so that it gets executed last.

In [8]:
def dms2dd(d,m,s):
    return d+(m+s/60.)/60.
  
In [9]:
dms2dd(41,33,15.7)
Out[9]:
41.55436111111111
In [10]:
-dms2dd(70,30,20.2)
Out[10]:
-70.50561111111111
In [11]:
x = '''
Station, Lat, Lon
Falmouth Harbor,    41.541575, -70.608020
Sage Lot Pond, 41.554361, -70.505611
'''
In [12]:
x = '''
Station, Lat, Lon
Boston,             42.368186, -71.047984
Carolyn Seep Spot,    39.8083, -69.5917
Falmouth Harbor,  41.541575, -70.608020
'''
In [13]:
# Enter desired (Station, Lat, Lon) values here:
x = '''
Station, Lat, Lon
Boston,             42.368186, -71.047984
Scituate Harbor,    42.199447, -70.720090
Scituate Beach,     42.209973, -70.724523
Falmouth Harbor,    41.541575, -70.608020
Marion,             41.689008, -70.746576
Marshfield,         42.108480, -70.648691
Provincetown,       42.042745, -70.171180
Sandwich,           41.767990, -70.466219
Hampton Bay,        42.900103, -70.818510
Gloucester,         42.610253, -70.660570
'''
In [14]:
# Create a Pandas DataFrame
obs=pd.read_csv(StringIO(x.strip()), sep=",\s*",index_col='Station')
In [15]:
obs
Out[15]:
Lat Lon
Station
Boston 42.368186 -71.047984
Scituate Harbor 42.199447 -70.720090
Scituate Beach 42.209973 -70.724523
Falmouth Harbor 41.541575 -70.608020
Marion 41.689008 -70.746576
Marshfield 42.108480 -70.648691
Provincetown 42.042745 -70.171180
Sandwich 41.767990 -70.466219
Hampton Bay 42.900103 -70.818510
Gloucester 42.610253 -70.660570
In [16]:
# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x,y,xi,yi):
    ind=ones(len(xi),dtype=int)
    for i in arange(len(xi)):
        dist=sqrt((x-xi[i])**2+(y-yi[i])**2)
        ind[i]=dist.argmin()
    return ind
In [17]:
# open NECOFS remote OPeNDAP dataset 
nc=netCDF4.Dataset(url).variables
In [18]:
# find closest NECOFS nodes to station locations
obs['0-Based Index'] = nearxy(nc['lon'][:],nc['lat'][:],obs['Lon'],obs['Lat'])
obs
Out[18]:
Lat Lon 0-Based Index
Station
Boston 42.368186 -71.047984 90913
Scituate Harbor 42.199447 -70.720090 37964
Scituate Beach 42.209973 -70.724523 28474
Falmouth Harbor 41.541575 -70.608020 47470
Marion 41.689008 -70.746576 49654
Marshfield 42.108480 -70.648691 24272
Provincetown 42.042745 -70.171180 26595
Sandwich 41.767990 -70.466219 38036
Hampton Bay 42.900103 -70.818510 13022
Gloucester 42.610253 -70.660570 22082

Specify the range of dates you want to extract, either all time steps, a specified number of days around the current date or for a specific period in the past (for archive datasets)

In [19]:
# get time values and convert to datetime objects
time_var = nc['time']

# all time steps
istart = 0
istop = len(time_var)-1
In [20]:
# Use specific times (UTC) ...
start = dt.datetime(2012,2,16,0,0,0)
istart = netCDF4.date2index(start,time_var,select='nearest')
stop = dt.datetime(2012,2,18,0,0,0)
istop = netCDF4.date2index(stop,time_var,select='nearest')
In [21]:
# Use times near "NOW" ...
start = dt.datetime.utcnow() - dt.timedelta(days=1)
istart = netCDF4.date2index(start,time_var,select='nearest')
stop = dt.datetime.utcnow() + dt.timedelta(days=3)
istop = netCDF4.date2index(stop,time_var,select='nearest')
In [22]:
print 'istart,istop=',istart,istop
istart,istop= 85 144

In [23]:
# get all time steps of water level from each station
nsta=len(obs)
ii=range(istart,istop)
z=ones((len(ii),nsta))
for i in range(nsta):
    z[:,i] = nc['zeta'][ii,obs['0-Based Index'][i]]
    
In [24]:
jd = netCDF4.num2date(time_var[ii],time_var.units)
In [25]:
# make a DataFrame out of the interpolated time series at each location
zvals=pd.DataFrame(z,index=jd,columns=obs.index)
In [26]:
# list out a few values
zvals.head()
Out[26]:
Station Boston Scituate Harbor Scituate Beach Falmouth Harbor Marion Marshfield Provincetown Sandwich Hampton Bay Gloucester
2014-02-03 13:01:53 -1.548685 0.486000 -0.640000 0.050240 0.341351 -1.300522 -1.359574 -1.372528 -0.289000 -1.287029
2014-02-03 13:58:08 -0.804319 0.486000 -0.640000 0.311570 0.902277 -0.764226 -0.890149 -0.852898 -0.289000 -0.710075
2014-02-03 15:00:00 -0.128420 0.486000 -0.043429 0.533869 1.233727 -0.036886 -0.082767 -0.008128 -0.060291 -0.025076
2014-02-03 16:01:53 0.659285 0.748702 0.791304 0.548589 1.119893 0.758620 0.665184 0.720584 0.792306 0.827408
2014-02-03 16:58:08 1.568771 1.455214 1.462268 0.476323 0.730958 1.449316 1.387045 1.440940 1.415007 1.476997
In [27]:
# plotting at DataFrame is easy!
ax=zvals.plot(figsize=(16,4),grid=True,title=('NECOFS Forecast Water Level from %s Forecast' % model),legend=False);

# but of course we aren't exactly satisfied with the default plot, so...
# Specify exactly the way we want the dates labeled
xax = ax.get_xaxis()
xax.set_major_locator(DayLocator())
xax.set_major_formatter(DateFormatter('%b %d'))

xax.set_minor_locator(HourLocator(byhour=range(0,24,3)))
xax.set_minor_formatter(DateFormatter('%H'))# read units from dataset for ylabel
xax.set_tick_params(which='major', pad=15)
xax.grid(b=True,which='minor')
setp(gca().get_xticklabels(), rotation=0, horizontalalignment='center')
ylabel(nc['zeta'].units)
# plot the legend outside the axis frame
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5));
In [28]:
# make a new DataFrame of maximum water levels at all stations
b=pd.DataFrame(zvals.idxmax(),columns=['time of max water level (UTC)'])
# create heading for new column containing max water level
zmax_heading='zmax (%s)' % nc['zeta'].units
# Add new column to DataFrame
b[zmax_heading]=zvals.max()
In [29]:
b
Out[29]:
time of max water level (UTC) zmax (meters)
Station
Boston 2014-02-04 07:01:53 2.118573
Scituate Harbor 2014-02-04 07:01:53 2.013487
Scituate Beach 2014-02-04 07:01:53 2.015532
Falmouth Harbor 2014-02-04 04:58:08 0.595833
Marion 2014-02-04 04:01:53 1.235376
Marshfield 2014-02-04 07:01:53 2.022758
Provincetown 2014-02-04 07:01:53 2.066864
Sandwich 2014-02-04 07:01:53 2.114579
Hampton Bay 2014-02-04 07:01:53 1.962028
Gloucester 2014-02-04 07:01:53 1.957828
In [30]:
HTML(html)
Out[30]:

This post was written as an IPython notebook. It is available for download or as a static html.

Creative Commons License

Comments