You will find a shut link involving machine learning and compression. A method that predicts the posterior probabilities of a sequence offered its complete heritage can be employed for optimum data compression (through the use of arithmetic coding on the output distribution). Clustering by using Substantial Indel Permuted Slopes, CLIPS, http://finance.yahoo.com/news/ventura-provides-cutting-edge-los-114000081.html