First of all, remember that a digital image is a matrix in which each cell contains a value: a single number or a group of numbers (think of RGB images, which store a triplet per pixel). That number defines an intensity within a certain scale (8-bit, 16-bit, or 32-bit images) and certain conventions (black is 0, white is the maximum value). When you change the value of a pixel, the change is reflected in the grayscale or color representation on the screen. Filters are mathematical functions that generate new pixel values from old ones. There are many applications of this, from noise removal to edge detection.
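As a minimal sketch of this idea (assuming NumPy, which is a common choice for representing images but is not mentioned above), here is a tiny 8-bit grayscale "image" as a matrix, with one pixel changed from black to white:

```python
import numpy as np

# A grayscale image is just a 2-D matrix of intensities.
# In an 8-bit image, values run from 0 (black) to 255 (white).
image = np.zeros((4, 4), dtype=np.uint8)  # a 4x4 all-black image

# Changing a pixel's value changes what would be shown on screen:
image[1, 2] = 255  # set one pixel to white

print(image[1, 2])  # 255
print(image.max())  # 255
```

An RGB image works the same way, except each cell holds a triplet, e.g. an array of shape `(height, width, 3)`.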
The process of assigning new pixel values based on the value of each pixel and its neighbors is called filtering in the spatial domain, and it is achieved through a mathematical operation called convolution. In our context, convolution takes the original image and a second, smaller one, called a kernel or mask. The values in the mask are called weights...
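To make the operation concrete, here is a naive sketch of spatial-domain convolution, assuming NumPy (the function name `convolve2d` and the 3×3 mean kernel are illustrative choices, not taken from the text above): each output pixel is the weighted sum of the corresponding neighborhood in the input, using the mask's weights.

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive spatial-domain convolution over the valid region
    (no padding at the borders, for simplicity)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    # Flip the kernel: this is what distinguishes convolution
    # from plain correlation (irrelevant for symmetric kernels).
    k = np.flipud(np.fliplr(kernel))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # New pixel value = weighted sum of the old pixel
            # and its neighbors.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

img = np.array([[10, 10, 10, 10],
                [10, 50, 50, 10],
                [10, 50, 50, 10],
                [10, 10, 10, 10]], dtype=float)

mean_kernel = np.ones((3, 3)) / 9.0  # every weight is 1/9: a smoothing mask
smoothed = convolve2d(img, mean_kernel)
print(smoothed)
```

With the mean kernel, each output value is the average of a 3×3 neighborhood, which blurs the bright central block into its darker surroundings; this is the noise-removal use case mentioned earlier. Real code would typically rely on an optimized library routine rather than these explicit loops.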