I have many 1-D arrays of data that contain occasional NaNs where there were no samples in that depth bin. Something like this…
But much bigger, and I have hundreds of them. Most NaNs are isolated between two valid values, but they still make my contour plots look terrible.
Rather than just mask them, I want to interpolate so my plot doesn’t have holes in it where it need not.
I want to change any NaN which is preceded and followed by a value to the average of those two values.
If it has only one valid neighbor, I want to change it to the value of its neighbor.
Here’s a simplified version of my code:
from copy import copy
import numpy as np

sample_array = np.array([np.nan, 1, 2, 3, np.nan, 5, 6, 7, 8, np.nan, np.nan, 11, 12, np.nan, np.nan, np.nan])

# Make a copy so we aren't working on the original
cast = copy(sample_array)

# Now iterate over the copy
for j, sample in enumerate(cast):
    # If this sample is a NaN, let's try to interpolate
    if np.isnan(sample):
        # Get the neighboring values, but make sure we don't index out of bounds
        prev_val = cast[max(j - 1, 0)]
        next_val = cast[min(j + 1, cast.size - 1)]
        print("Trying to fix", prev_val, "->", sample, "<-", next_val)
        # First try an average of the neighbors
        inter_val = 0.5 * (prev_val + next_val)
        if np.isnan(inter_val):
            # There must have been a neighboring NaN, so just use the only valid neighbor
            inter_val = np.nanmax([prev_val, next_val])
        if np.isnan(inter_val):
            print(" No changes made")
        else:
            print(" Fixed to", prev_val, "->", inter_val, "<-", next_val)
            # Now fix the value in the original array
            sample_array[j] = inter_val
After this is run, we have:
sample_array = array([1,1,2,3,4,5,6,7,8,8,11,11,12,12,np.nan,np.nan])
This works, but it is very slow for something that will run on the back end of a web page.
Perhaps something that uses masked arrays and some of the numpy.ma methods?
I keep thinking there must be some much more clever way of doing this.
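For reference, here is a minimal sketch of one vectorized direction, not a definitive implementation. It applies the same rule (mean of two valid neighbors, copy of a single valid neighbor, otherwise leave the NaN) using shifted views of the array instead of a Python loop. The function name fill_isolated_nans is hypothetical, and the sketch assumes 1-D float input and plain NumPy rather than numpy.ma.

```python
import numpy as np


def fill_isolated_nans(values):
    """Fill each NaN from its immediate neighbors in the original array.

    Two valid neighbors -> their mean; one valid neighbor -> that value;
    no valid neighbor -> the NaN is left in place.
    (Hypothetical helper; assumes a 1-D array of floats.)
    """
    values = np.asarray(values, dtype=float)

    # Pad both ends with NaN so the first and last samples simply see
    # a missing neighbor on their outer side.
    padded = np.pad(values, 1, mode="constant", constant_values=np.nan)
    prev_val = padded[:-2]  # prev_val[j] is values[j - 1]
    next_val = padded[2:]   # next_val[j] is values[j + 1]

    prev_ok = ~np.isnan(prev_val)
    next_ok = ~np.isnan(next_val)
    n_valid = prev_ok.astype(int) + next_ok.astype(int)

    # Sum only the valid neighbors, treating missing ones as zero.
    neighbor_sum = np.where(prev_ok, prev_val, 0.0) + np.where(next_ok, next_val, 0.0)

    # Average over however many valid neighbors exist (1 or 2).
    fill = np.full(values.shape, np.nan)
    has_neighbor = n_valid > 0
    fill[has_neighbor] = neighbor_sum[has_neighbor] / n_valid[has_neighbor]

    # Only positions that were NaN in the input are replaced.
    return np.where(np.isnan(values), fill, values)
```

Called on the sample_array from the question, this should reproduce the result shown above, including the two trailing NaNs that have no valid neighbor. It returns a new array rather than modifying the input in place.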