Dealing with Incomplete Data: Scientific, Practical, and Ethical Considerations

David Brooks, April 2011

Climate scientists deal with uncertainties every day. Sometimes those uncertainties result from incomplete data. The purpose of this exercise is to provide a concrete example of such a situation.

Suppose you are developing a project that requires knowing the yearly changes in total solar energy reaching the ground at a NOAA CRN (Climate Reference Network) site in Asheville, NC. These sites are intended to provide the best possible data for assessing climate now and in the future. But no data collection system is perfect. You download the file that contains the daily integrated insolation for 2002-2010 and, when you examine it, you find several days with no data.

You know that you cannot calculate the total yearly solar energy reaching the ground if data for some of the days are missing. What are your choices? Are there appropriate and inappropriate choices? Explain and justify how you would deal with these data.

This file is available here. The days with missing data are listed below. The date (YYYYMMDD) is in the second column and the integrated daily insolation, in units of megajoules per square meter per day (MJ/m²/day), is in the rightmost column. A value of -9999.0 or -9999 represents a missing value.

53877 20020402  1.006  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20031021  1.006  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20031022  1.201  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
…
53877 20031024  1.201  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20041028  1.201  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20051213  1.201  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20070221  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20071203  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20081029  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20081030  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20090618  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20091120  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20091219  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091220  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091221  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091222  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091223  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091224  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091225  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091226  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091227  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091228  -9999  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 
53877 20091229  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20100629  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20101112  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00 

53877 20101229  2.402  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00
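
Before deciding how to treat the gaps, it can help to see how many days are missing in each year and whether they are isolated or come in runs. The following Python sketch shows one way to screen rows in the format above. It assumes only what the text states: whitespace-delimited columns, the date in the second column, insolation in the rightmost column, and -9999.0 or -9999 as the missing-value flag. The function names (parse_row, missing_days, group_gaps) are hypothetical, and the last sample row, with an insolation of 21.50, is made up purely to show a non-missing value.

```python
from collections import Counter
from datetime import date, datetime, timedelta

MISSING = -9999.0  # missing-value flag stated in the text (-9999.0 or -9999)


def parse_row(line: str):
    """Return (day, insolation or None), or None if the line is not a data row."""
    fields = line.split()
    if len(fields) < 3 or not fields[1].isdigit():
        return None  # blank lines, ellipses, headers, etc.
    day = datetime.strptime(fields[1], "%Y%m%d").date()
    value = float(fields[-1])  # rightmost column: MJ/m^2/day
    return day, (None if value == MISSING else value)


def missing_days(lines) -> list[date]:
    """Collect the dates whose insolation is flagged as missing."""
    days = []
    for line in lines:
        row = parse_row(line)
        if row is not None and row[1] is None:
            days.append(row[0])
    return sorted(days)


def group_gaps(days: list[date]) -> list[tuple[date, date]]:
    """Merge consecutive missing days into (first_day, last_day) runs."""
    runs: list[list[date]] = []
    for day in days:
        if runs and day - runs[-1][1] == timedelta(days=1):
            runs[-1][1] = day  # extends the current run
        else:
            runs.append([day, day])  # starts a new run
    return [(start, end) for start, end in runs]


# Three rows copied from the listing, plus one MADE-UP row (20090619)
# with a valid insolation value, for illustration only.
sample = """
53877 20081029  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00
53877 20081030  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00
53877 20090618  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0 -9999.00
53877 20090619  1.303  -82.61   35.49 -9999.0 -9999.0 -9999.0 -9999.0 -9999.0    21.50
""".splitlines()

missing = missing_days(sample)
gaps = group_gaps(missing)
per_year = Counter(day.year for day in missing)

for start, end in gaps:
    print(start, "to", end, "-", (end - start).days + 1, "day(s) missing")
print("missing days per year:", dict(per_year))
```

Run against the full file, a summary like this shows which years are affected and how long each gap is. Isolated missing days and multi-day runs may call for different treatment, and that choice is the subject of the exercise rather than something the code decides.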