..

scoring-models

[!question]- What is the problem with Boolean search?

[!question]- What is the problem with the Jaccard coefficient?

Procedure

  • Find tf(t, d), the number of times term t occurs in document d

  • Convert this to a log scale: 1 + log(tf(t, d)) if tf(t, d) > 0, else 0

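  • Find idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents that contain t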

  • Multiply the log-scaled tf by idf(t) to get the tf-idf weight of each term in the document

  • Normalize the document's weight vector to unit length (divide by its Euclidean norm)

  • Do the same for the query

  • Find the dot product of the query and document vectors; since both have unit length, this is their cosine similarity
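The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (log_tf, idf, tfidf_vector, score) are made up for this note, documents are assumed to be pre-tokenized lists of strings, and the logarithm is assumed to be base 10 (the notes do not fix the base).

```python
import math
from collections import Counter


def log_tf(tf):
    # Log-scaled term frequency: 1 + log10(tf) when the term occurs, else 0.
    return 1 + math.log10(tf) if tf > 0 else 0.0


def idf(term, docs):
    # Inverse document frequency: log10(N / df(t)).
    # Returns 0 for a term that appears in no document (an assumption
    # made here to avoid dividing by zero).
    df = sum(1 for d in docs if term in set(d))
    return math.log10(len(docs) / df) if df else 0.0


def tfidf_vector(tokens, docs):
    # Weight each term by log-tf * idf, then normalize to unit length.
    counts = Counter(tokens)
    vec = {t: log_tf(c) * idf(t, docs) for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec


def score(query_tokens, doc_tokens, docs):
    # Dot product of the two unit vectors, i.e. their cosine similarity.
    q = tfidf_vector(query_tokens, docs)
    d = tfidf_vector(doc_tokens, docs)
    return sum(w * d.get(t, 0.0) for t, w in q.items())


# Toy collection, invented for illustration only.
docs = [
    ["car", "insurance", "auto"],
    ["car", "best"],
    ["auto", "best", "best"],
]
query = ["best", "car"]
ranking = sorted(range(len(docs)), key=lambda i: score(query, docs[i], docs), reverse=True)
```

Here ranking holds the document indices ordered from highest to lowest score for the query. Sparse dictionaries are used for the vectors so that only terms that actually occur are stored; a real system would typically precompute document vectors in an inverted index instead of rebuilding them per query.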