(Thanks to Guilherme Ludwig.)
Images in computers are stored as matrices, usually one for each of three color channels. A mean filter for a window of size \(k\) is a transformation of a color channel such that each pixel \(x_{i,j}\) is replaced by an average of some nearby values of the same color channel (depending on \(k\)).
Suppose the window is a square. For \(k=1\), the window centered around the \(x_{i,j}\)-th component of a matrix is
\[ \begin{pmatrix} x_{i-1,j-1} & x_{i-1,j} & x_{i-1,j+1} \\ x_{i,j-1} & x_{i,j} & x_{i,j+1} \\ x_{i+1,j-1} & x_{i+1,j} & x_{i+1,j+1} \\ \end{pmatrix} \]
and the filtered \(x^\ast_{i,j}\) is the average of all 9 values. For \(k=2\), the window becomes instead
\[ \begin{pmatrix} x_{i-2,j-2} & x_{i-2,j-1} & x_{i-2,j} & x_{i-2,j+1} & x_{i-2,j+2} \\ x_{i-1,j-2} & x_{i-1,j-1} & x_{i-1,j} & x_{i-1,j+1} & x_{i-1,j+2} \\ x_{i,j-2} & x_{i,j-1} & x_{i,j} & x_{i,j+1} & x_{i,j+2} \\ x_{i+1,j-2} & x_{i+1,j-1} & x_{i+1,j} & x_{i+1,j+1} & x_{i+1,j+2} \\ x_{i+2,j-2} & x_{i+2,j-1} & x_{i+2,j} & x_{i+2,j+1} & x_{i+2,j+2} \\ \end{pmatrix} \]
and the filtered \(x^\ast_{i,j}\) is the average of all 25 values.
For example, here is a \(5 \times 5\) pixels image red channel whose values I made up:
\[ \begin{pmatrix} 0.5 & 0.5 & 0.4 & 0.4 & 0.3 \\ 0.5 & 0.5 & 0.4 & 0.4 & 0.3 \\ 0.4 & 0.4 & 0.3 & 0.3 & 0.2 \\ 0.4 & 0.3 & 0.2 & 0.2 & 0.2 \\ 0.3 & 0.3 & 0.2 & 0.1 & 0.1 \\ \end{pmatrix} \]
For \(k = 1\), a filtered version of this matrix replaces \(x_{2,2} = 0.5\) with \(x^\ast_{2,2} = 0.4333\), \(x_{2,3} = 0.4\) with \(x^\ast_{2,3} = 0.4\), \(x_{3,2} = 0.4\) with \(x^\ast_{3,2} = 0.3777\), etc. For the edge values (like \(x_{1,1}\)), treat the values outside of the matrix as zeros (there are better choices). This is calling “padding” the image with zeros. For example, for \(k = 1\) we might enlarge the matrix so that it becomes
\[ \begin{pmatrix} 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.5 & 0.5 & 0.4 & 0.4 & 0.3 & 0.0 \\ 0.0 & 0.5 & 0.5 & 0.4 & 0.4 & 0.3 & 0.0 \\ 0.0 & 0.4 & 0.4 & 0.3 & 0.3 & 0.2 & 0.0 \\ 0.0 & 0.4 & 0.3 & 0.2 & 0.2 & 0.2 & 0.0 \\ 0.0 & 0.3 & 0.3 & 0.2 & 0.1 & 0.1 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \end{pmatrix} \]
and consequently \(x^\ast_{1,1} = 0.2222\). To see this in action, let’s implement some code:
# original matrix
X <- matrix(c(.5,.5,.4,.4,.3,.5,.5,.4,.3,.3,.4,.4,.3,.2,.2,.4,.4,.3,.2,.1,.3,.3,.2,.2,.1), ncol=5)
X
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.5 0.5 0.4 0.4 0.3
## [2,] 0.5 0.5 0.4 0.4 0.3
## [3,] 0.4 0.4 0.3 0.3 0.2
## [4,] 0.4 0.3 0.2 0.2 0.2
## [5,] 0.3 0.3 0.2 0.1 0.1
k <- 1
pad.X <- matrix(0, dim(X)[1]+2*k, dim(X)[2]+2*k)
pad.X[(k+1):(dim(X)[1]+k), (k+1):(dim(X)[2]+k)] <- X
pad.X
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0 0.0 0.0 0.0 0.0 0.0 0
## [2,] 0 0.5 0.5 0.4 0.4 0.3 0
## [3,] 0 0.5 0.5 0.4 0.4 0.3 0
## [4,] 0 0.4 0.4 0.3 0.3 0.2 0
## [5,] 0 0.4 0.3 0.2 0.2 0.2 0
## [6,] 0 0.3 0.3 0.2 0.1 0.1 0
## [7,] 0 0.0 0.0 0.0 0.0 0.0 0
image(t(X[nrow(X):1,]), main="Raw") # image() needs to rotate and flip matrices for display
After padding, take the averages around each non-pad entry, and crop the padding out (code hidden):
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.2222222 0.3111111 0.2888889 0.2444444 0.15555556
## [2,] 0.3111111 0.4333333 0.4000000 0.3333333 0.21111111
## [3,] 0.2777778 0.3777778 0.3333333 0.2777778 0.17777778
## [4,] 0.2333333 0.3111111 0.2555556 0.2000000 0.12222222
## [5,] 0.1444444 0.1888889 0.1444444 0.1111111 0.06666667
Save the file Van_Gogh.png (“Wheatfield with Crows”) in the same directory as your script. (Run “Session” -> “Set Working Directory” if necessary; do not change the file name. The readPNG()
function we’ll use does not take hyperlink connections.) Here is the image:
Now we’ll read the image into R and decompose it into three color channels:
if (!require("png")) {
install.packages("png")
stopifnot(require("png"))
}
## Loading required package: png
str(vg <- readPNG("Van_Gogh.png"))
## num [1:374, 1:800, 1:3] 0.1255 0.3647 0.3059 0.1098 0.0235 ...
dim(vg)
## [1] 374 800 3
red.vg <- vg[,,1]
green.vg <- vg[,,2]
blue.vg <- vg[,,3]
str(blue.vg)
## num [1:374, 1:800] 0.1255 0.2588 0.3373 0.2078 0.0235 ...
The readPNG()
function returns an array, a generalization of a matrix to more than 2 dimensions. Here the third dimension holds the color channels, 1=“Red”, 2=“Green”, and 3=“Blue”, respectively. Taking all rows and all columns, with a fixed third index, gives a matrix; for example, blue.vg
is a matrix. The data need to be transposed and flipped horizontally to be displayed correctly, but that’s not necessary for your calculations (see ?image
). Here are the channels individually:
layout(matrix(1:3, ncol=3))
image(t(red.vg[nrow(red.vg):1,]), col = gray((1:12)/13), main="Red channel")
image(t(green.vg[nrow(green.vg):1,]), col = gray((1:12)/13), main="Green channel")
image(t(blue.vg[nrow(blue.vg):1,]), col = gray((1:12)/13), main="Blue channel")
We will run the three channel filtering in parallel. Here are the steps:
array
by using the array()
function. This array must have dimensions \(n \times m \times 3\).writePNG()
function to create USERNAME_k.png
, where k
is the window size.USERNAME_1.png
, USERNAME_3.png
and USERNAME_5.png
, using the respective values for k
.