During my studies I took a course called "NLP", where we learned various natural language processing techniques. One exceptionally nifty technique I learned about was measuring text similarity with a vector space model. I was amazed by how little it actually takes to compare two or more texts.

In this post I'll explain how text similarity can be measured with a vector space model and walk through a practical example in PHP!

