Rich Signell's Notebook Blog

Blogging with IPython Notebooks

CKAN test

We want to find all Data.gov datasets that match a specific type of data (e.g. sea_water_temperature), fall within a specified geospatial extent and time window, and have a specific type of data endpoint (e.g. OPeNDAP). Data.gov runs on CKAN, so while we wait for a CSW interface, here we try the CKAN API, first with the requests package and then with the ckanclient package.

In [2]:
import requests
import json
from pprint import pprint

Let's first try a request with the requests package.

In [3]:
f = requests.get("http://catalog.data.gov/api/3/search/dataset?q=%22sea_water_temperature%22").json()
print f.keys()
print f['count']
pprint(f['results'][0:5])
[u'count', u'results']
52
[u'physical-and-meteorological-data-from-the-tropical-atmosphere-ocean-tao-array-in-the-tropical-p2faf2',
 u'physical-and-meteorological-data-from-the-tropical-atmosphere-ocean-tao-array-in-the-tropical-p43e69',
 u'physical-and-meteorological-data-from-the-tropical-atmosphere-ocean-tao-array-in-the-tropical-p44495',
 u'physical-and-meteorological-data-from-the-tropical-atmosphere-ocean-tao-array-in-the-tropical-p5d444',
 u'physical-and-meteorological-data-from-the-tropical-atmosphere-ocean-tao-array-in-the-tropical-p652ed']
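
The search endpoint above is one way in; CKAN also exposes a `package_search` action under `/api/3/action/`. Here is a minimal sketch of building such a request, assuming Python 3 and only the standard library. The helper names `build_search_url` and `fetch_search` are made up for illustration, and `fetch_search` is defined but not called because it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://catalog.data.gov/api/3"  # base URL used in this post


def build_search_url(q, fq=None, rows=10, base=API_BASE):
    """Return the URL for a CKAN package_search action call."""
    params = {"q": q, "rows": rows}
    if fq is not None:
        params["fq"] = fq
    return base + "/action/package_search?" + urlencode(params)


def fetch_search(url):
    """Fetch a package_search URL and return its 'result' dict (needs network)."""
    with urlopen(url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    if not payload.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    return payload["result"]


url = build_search_url('tags:"sea_water_temperature"', rows=5)
print(url)
```

Keeping the URL construction separate from the fetch makes it easy to inspect exactly what query is being sent before hitting the server.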

Now let's try the ckanclient

In [4]:
import ckanclient
In [5]:
ckan = ckanclient.CkanClient('http://catalog.data.gov/api/3')

Try the same simple search we did with requests above

In [6]:
search_params = { 'q': 'tags:"sea_water_temperature" '}                                                                                            
d = ckan.action('package_search', **search_params) 
print d['count']
51

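Hmm... Interesting. With ckanclient we got 51 results back, whereas with requests we got 52. Could be a 0- vs 1-based index thing? Note, though, that the two queries are not identical: the requests call searched for the quoted phrase anywhere in the record, while this one matches only on tags, which could also explain the difference.
Now let's try a more complex search, asking for only 3 results back. Below we ask for res_format:PDF, but what are the other possible res_format values we can query on?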

In [7]:
search_params = {                                                                                      
    'q': 'tags:"sea_water_temperature" AND metadata_modified:[2012-06-01T00:00:00.000Z TO NOW]',       
    'fq': 'res_format:PDF',                                                                            
    'extras': {"ext_bbox":"-121,45,-120,46"},                                                          
    'rows': 3                                                                                       
}
d = ckan.action('package_search', **search_params) 
print d['count']
7
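
The `q` string above mixes a tag match with a Solr-style date range, which is easy to mistype by hand. A small sketch of building it from Python datetime objects, assuming the datetimes are already in UTC. The function names are hypothetical helpers, not part of ckanclient.

```python
from datetime import datetime


def solr_timestamp(dt):
    """Format a datetime (assumed UTC) the way the query above writes it."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def tag_and_time_query(tag, start, stop=None):
    """Build a q string: a tag match AND a metadata_modified range."""
    upper = "NOW" if stop is None else solr_timestamp(stop)
    return 'tags:"{0}" AND metadata_modified:[{1} TO {2}]'.format(
        tag, solr_timestamp(start), upper)


q = tag_and_time_query("sea_water_temperature", datetime(2012, 6, 1))
print(q)
```

Keep in mind that `metadata_modified` is when the catalog record was last changed, which is not the same thing as the time span covered by the data.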

In [8]:
search_params = {
    'q': 'tags:"temperature"',
    'fq': 'res_format:PDF',
    'extras': {"ext_bbox":"-125,38,-124,39"},
    'rows': 10
}
d = ckan.action('package_search', **search_params) 
print d['count']
10
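
On the earlier question of which res_format values exist: one way to probe it might be Solr faceting, which `package_search` supports through a `facet.field` parameter. The sketch below only builds the parameters and parses a response. The sample response is made up to show the shape I would expect, and how the field list has to be encoded (JSON string vs. plain list) may depend on whether the call is a GET or a JSON POST, so treat that detail as an assumption.

```python
import json


def facet_params(q, field="res_format", limit=50):
    """Build package_search parameters that ask only for facet counts."""
    return {
        "q": q,
        "rows": 0,                           # no datasets, just the counts
        "facet.field": json.dumps([field]),  # assumed encoding for a GET call
        "facet.limit": limit,
    }


def facet_counts(result, field="res_format"):
    """Return (value, count) pairs for one facet, most common first."""
    items = result.get("search_facets", {}).get(field, {}).get("items", [])
    pairs = [(item["name"], item["count"]) for item in items]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up response fragment, only to illustrate the expected structure.
sample = {"search_facets": {"res_format": {"title": "res_format", "items": [
    {"name": "PDF", "display_name": "PDF", "count": 3},
    {"name": "NetCDF", "display_name": "NetCDF", "count": 7},
]}}}
print(facet_counts(sample))
```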

So what does one of these results look like? Let's take a look at the keys

In [9]:
print d['results'][0].keys()
[u'license_title', u'maintainer', u'relationships_as_object', u'private', u'maintainer_email', u'num_tags', u'id', u'metadata_created', u'metadata_modified', u'author', u'author_email', u'state', u'version', u'license_id', u'type', u'resources', u'num_resources', u'tags', u'tracking_summary', u'groups', u'organization', u'relationships_as_subject', u'revision_timestamp', u'name', u'isopen', u'url', u'notes', u'owner_org', u'extras', u'title', u'revision_id']

In [10]:
pprint(d['results'][0]['resources'])
[{u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479373',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'ae4f1268-5e50-4ac8-aee8-81d75504f92d',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 0,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074922',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479392',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'567f688c-6648-4328-b9ca-e46bd1dd56cc',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 1,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479397',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'65d1f965-27d6-429c-b1b2-b374d07d78e6',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 2,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/07',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479401',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'10c5d503-89cc-4b2d-89da-0fe758e37d18',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 3,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479405',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'53c1b069-fd71-457c-8292-fe97d2679a0e',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 4,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://accession.nodc.noaa.gov/0074922',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479408',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'f8e30cb9-372a-46e8-9c8a-dd931c29f593',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 5,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://accession.nodc.noaa.gov/0070493',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479412',
  u'description': u'',
  u'format': u'',
  u'hash': u'',
  u'id': u'9ebc0d66-110e-4c32-9e12-84918dbe7eb0',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'Web Page',
  u'position': 6,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2',
  u'webstore_last_updated': None,
  u'webstore_url': None},
 {u'cache_last_updated': None,
  u'cache_url': None,
  u'created': u'2013-05-20T01:41:12.479416',
  u'description': u'eval_NDBC_moored_winds.pdf',
  u'format': u'PDF',
  u'hash': u'',
  u'id': u'3746c191-6072-4e36-9f8f-e71798b797e1',
  u'last_modified': None,
  u'mimetype': None,
  u'mimetype_inner': None,
  u'name': u'PDF File',
  u'position': 7,
  u'resource_group_id': u'bcd1a009-703b-4563-bf0d-033a605d4661',
  u'resource_locator_function': u'',
  u'resource_locator_protocol': u'',
  u'resource_type': None,
  u'revision_id': u'45ed7bd3-2bc4-46e1-bd52-88783cbdd676',
  u'revision_timestamp': u'2013-05-19T23:41:12.457220',
  u'size': None,
  u'state': u'active',
  u'tracking_summary': {u'recent': 0, u'total': 0},
  u'url': u'http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf',
  u'webstore_last_updated': None,
  u'webstore_url': None}]

In [11]:
print d['results'][0]['resources'][0]['url']
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074922

Now let's see what the URLs look like for all the resources

In [12]:
for item in d['results']:
    for member in item['resources']:
        print 'url:',member['url']
        print 'protocol:',member['resource_locator_protocol']
        print 'resource_type:',member['resource_type']
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074922
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/07
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0074922
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0070959
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/01
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070959
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0071368
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/02
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0071368
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0073426
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/05
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0073426
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0072077
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0072077
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074384
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/06
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0074384
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0072886
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://data.nodc.noaa.gov/opendap/ndbc/cmanwx/2011/04
protocol: 
resource_type: None
url: ftp://ftp.nodc.noaa.gov/pub/data.nodc/ndbc/cmanwx/
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0072886
protocol: 
resource_type: None
url: http://accession.nodc.noaa.gov/0070493
protocol: 
resource_type: None
url: http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0097%3ADWFEOT%3E2.0.CO%3B2
protocol: 
resource_type: None
url: http://www.worldwindsinc.com/TechPapers/dgil/eval_NDBC_moored_winds.pdf
protocol: 
resource_type: None
url: http://repository.stategeothermaldata.org/metadata/record/98ddf901b9782a25982e01af3b0b6710/file/ca_sanyal_geopres_resources_in_ca.pdf
protocol: 
resource_type: None
url: ftp://ftp.consrv.ca.gov/pub/oil/publications/tr15.pdf
protocol: 
resource_type: None
url: http://repository.stategeothermaldata.org/metadata/record/98ddf901b9782a25982e01af3b0b33f8/file/the_nature_and_occurrence_of_geopressured_resource_areas_in_the_state_of_california_-_by_subir_sanyal_etal.pdf
protocol: 
resource_type: None

So there are multiple resources for each record. Let's check out some specific resource URLs for all datasets

In [13]:
resource_number=0
for item in d['results']:
    print item['resources'][resource_number]['url']
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074922
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0070959
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0071368
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0073426
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0072077
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074384
http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0072886
http://repository.stategeothermaldata.org/metadata/record/98ddf901b9782a25982e01af3b0b6710/file/ca_sanyal_geopres_resources_in_ca.pdf
ftp://ftp.consrv.ca.gov/pub/oil/publications/tr15.pdf
http://repository.stategeothermaldata.org/metadata/record/98ddf901b9782a25982e01af3b0b33f8/file/the_nature_and_occurrence_of_geopressured_resource_areas_in_the_state_of_california_-_by_subir_sanyal_etal.pdf

In [14]:
resource_number=1
for item in d['results']:
    print item['resources'][resource_number]['url']
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-14-40679c96a025> in <module>()
      1 resource_number=1
      2 for item in d['results']:
----> 3     print item['resources'][resource_number]['url']

IndexError: list index out of range
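
The IndexError comes from the last three datasets, which have only one resource each (see the full listing above), so asking for resource 1 runs off the end of their lists. A small sketch of a more forgiving lookup. The helper name is hypothetical, and the sample records are trimmed-down stand-ins built from URLs shown earlier.

```python
def nth_resource_urls(results, n):
    """Return the URL of resource n for each dataset, or None if it has fewer."""
    urls = []
    for item in results:
        resources = item.get("resources", [])
        urls.append(resources[n]["url"] if n < len(resources) else None)
    return urls


# Trimmed-down stand-ins for two of the search results.
sample_results = [
    {"resources": [
        {"url": "http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/0074922"},
        {"url": "http://data.nodc.noaa.gov/thredds/catalog/ndbc/cmanwx/"},
    ]},
    {"resources": [
        {"url": "ftp://ftp.consrv.ca.gov/pub/oil/publications/tr15.pdf"},
    ]},
]
print(nth_resource_urls(sample_results, 1))
```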

Hmm... These are not data granules, but data collections. We need the actual OPeNDAP dataset URLs. How do we get from this catalog information to the actual dataset service endpoints? Let's try looking for NetCDF

In [15]:
search_params = {                                                                               
    'q': 'tags:"sea_water_temperature"',
    'fq': 'res_format:NetCDF',
    'rows': 5                                                                                                                            
}
d = ckan.action('package_search', **search_params) 
print d['count']
0

Hmmm.. We got skunked. Maybe try just temperature instead of sea_water_temperature?

In [16]:
search_params = {                                                                               
    'q': 'tags:"temperature"',
    'fq': 'res_format:NetCDF',
    'rows': 5                                                                                                                            
}
d = ckan.action('package_search', **search_params) 
print d['count']
2574

Well, that's better!

In [17]:
for item in d['results']:
    for member in item['resources']:
        print member['url']
http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc.html
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc.html
http://data.nodc.noaa.gov/thredds/dodsC/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc.html
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2005/catalog.html?dataset=nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2005/catalog.html?dataset=nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc.html
http://data.nodc.noaa.gov/thredds/dodsC/nmsp/wcos/SMS/2005/catalog.html?dataset=nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc.html
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/TH042/2005/catalog.html?dataset=nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/TH042/2005/catalog.html?dataset=nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc.html
http://data.nodc.noaa.gov/thredds/dodsC/nmsp/wcos/TH042/2005/catalog.html?dataset=nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc.html
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/CE042/2006/catalog.html?dataset=nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/CE042/2006/catalog.html?dataset=nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc.html
http://data.nodc.noaa.gov/thredds/dodsC/nmsp/wcos/CE042/2006/catalog.html?dataset=nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc.html
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2008/catalog.html?dataset=nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2008/catalog.html?dataset=nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc.html
http://data.nodc.noaa.gov/thredds/dodsC/nmsp/wcos/SMS/2008/catalog.html?dataset=nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc

In [18]:
resource_number=0
for item in d['results']:
    print item['resources'][resource_number]['url']
http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc.html
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc.html
http://data.nodc.noaa.gov/opendap/nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc.html
http://data.nodc.noaa.gov/opendap/nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc.html
http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc.html

In [19]:
resource_number=1
for item in d['results']:
    print item['resources'][resource_number]['url']
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2005/catalog.html?dataset=nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/TH042/2005/catalog.html?dataset=nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/CE042/2006/catalog.html?dataset=nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2008/catalog.html?dataset=nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc

In [20]:
resource_number=2
for item in d['results']:
    print item['resources'][resource_number]['url']
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2005/catalog.html?dataset=nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/TH042/2005/catalog.html?dataset=nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/CE042/2006/catalog.html?dataset=nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc
http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/SMS/2008/catalog.html?dataset=nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc

Let's get a list of the OPeNDAP Data URLs

In [21]:
resource_number=3
dap_url=[]
for item in d['results']:
    dap_url.append(item['resources'][resource_number]['url'].split('.html')[0])
pprint(dap_url)
[u'http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc',
 u'http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2005/SMSXXX_015MTBD014R00_20051113.nc',
 u'http://data.nodc.noaa.gov/opendap/nmsp/wcos/TH042/2005/TH042X_042MTBD000R00_20050628.nc',
 u'http://data.nodc.noaa.gov/opendap/nmsp/wcos/CE042/2006/CE042X_042MTBD007R00_20060504.nc',
 u'http://data.nodc.noaa.gov/opendap/nmsp/wcos/SMS/2008/SMSXXX_015MTBD003R00_20080227.nc']
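
Picking a fixed resource position works for these five records, but it relies on every record listing its resources in the same order. A sketch of an alternative that looks for the OPeNDAP links by URL pattern instead. The function name is made up, and the `/opendap/` plus `.html` pattern is an assumption based only on the URLs seen in this search.

```python
def opendap_data_urls(results):
    """Collect OPeNDAP data URLs, found by URL pattern instead of by position."""
    found = []
    for item in results:
        for member in item.get("resources", []):
            url = member.get("url", "")
            if "/opendap/" in url and url.endswith(".html"):
                data_url = url[:-len(".html")]
                if data_url not in found:  # the same link is often listed twice
                    found.append(data_url)
    return found


# Stand-in for one search result, using URLs from the output above.
sample_results = [{"resources": [
    {"url": "http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc.html"},
    {"url": "http://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/WES001/2010/catalog.html?dataset=nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc"},
    {"url": "http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/WES001_030MTBD011R00_20100416.nc.html"},
]}]
print(opendap_data_urls(sample_results))
```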

Now let's try to read one of the OPeNDAP Data URLs

In [22]:
import netCDF4
In []:
nc = netCDF4.Dataset(dap_url[0])
In []:
nc.Conventions
In []:
print nc.variables.keys()
ncvar=nc.variables
In []:
print ncvar['yearday']
print ncvar['Temperature']

Uh, this says it's CF 1.4, but no way is it CF 1.4. There are not even units on time!
The long_name of time is just Day of the Year, but there doesn't even seem to be information in the file to say what year:

In []:
for name in nc.ncattrs():
    print name, '=', getattr(nc,name)
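
For comparison, a CF time coordinate carries units of the form "days since 2010-01-01 00:00:00". Here is a simplified sketch of checking for that shape. The regular expression covers only the common units and a plain date, so it is not a full CF or UDUNITS validator, and the function name is my own.

```python
import re

# Simplified pattern: "<unit> since <YYYY-MM-DD ...>", common units only.
CF_TIME_UNITS = re.compile(
    r"^\s*(seconds?|minutes?|hours?|days?)\s+since\s+\d{1,4}-\d{1,2}-\d{1,2}",
    re.IGNORECASE)


def looks_like_cf_time(units):
    """True if a units string has the 'unit since reference-date' form."""
    return bool(units) and CF_TIME_UNITS.match(units) is not None


print(looks_like_cf_time("days since 2010-01-01 00:00:00"))
print(looks_like_cf_time(None))
```

With the dataset open, something like `looks_like_cf_time(getattr(ncvar['yearday'], 'units', None))` would be one way to apply the check to the time variable.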

So if we are going to plot this, it's going to be with a human trying to figure out what is what:

In []:
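import matplotlib.pyplot as plt

n = 10000  # number of samples to plot (avoids shadowing the built-in len)
yd = ncvar['yearday'][0:n].flatten()
t = ncvar['Temperature'][0:n].flatten()
plt.figure(figsize=(12, 4))
plt.plot(yd, t)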

Still don't know what year this is. Guess 2012 since that's when the conversion script was run by NCDDC?
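
Another clue is the file name itself, which ends in what looks like a YYYYMMDD stamp (20100416) and sits under a 2010 directory. Below is a sketch of pulling that stamp out and using its year to turn a yearday value into a real date. Two assumptions here: that the stamp really reflects the year of the measurements, and that yearday is 1-based (1.0 means 00:00 on January 1). The value 106.5 is made up for the demonstration.

```python
import re
from datetime import datetime, timedelta


def date_from_filename(url):
    """Return the date encoded as YYYYMMDD just before '.nc', or None."""
    match = re.search(r"_(\d{8})\.nc", url)
    return datetime.strptime(match.group(1), "%Y%m%d") if match else None


def yearday_to_datetime(yearday, year):
    """Convert a day-of-year value to a datetime, assuming 1.0 is 00:00 on Jan 1."""
    return datetime(year, 1, 1) + timedelta(days=yearday - 1.0)


url = ("http://data.nodc.noaa.gov/opendap/nmsp/wcos/WES001/2010/"
       "WES001_030MTBD011R00_20100416.nc")
stamp = date_from_filename(url)
print(stamp)
print(yearday_to_datetime(106.5, stamp.year))  # 106.5 is a made-up yearday
```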

In []:
HTML(html)