Linear separability implies that if there are two classes then there will be a point, line, plane, or hyperplane that splits the input features in such a way that all points of one class are in one-half space and the second class is in the other half-space.
For example, here is a case of selling a house based on area and price. We have got a number of data points for that along with the class, which is house Sold/Not Sold:
In the preceding figure, all the N, are the class (event) of Not Sold, which has been derived based on the Price and Area of the house and all the instances of S represent the class of the house getting sold. The number of N and S represent the data points on which the class has been determined.
In the first diagram, N and S are quite close and happen to be more random, hence, it's difficult to have linear separability achieved as no matter how you try to separate two classes, at least one of them would be in the misclassified region. It implies that there...