New Bright Score Release

In the Data Science tradition of deploying new code on Friday and giving it the whole weekend to be broken, we’re pushing out a new version of the Bright Score code today.
As far as the changes go, the nerdy explanation is that we ran a Principal Component Analysis (PCA) on some of our strongest features to create an orthogonal feature space that reduces double-counting. What this means is that we’re combining some of the features so that we don’t give too much extra credit for scoring well on similar features.
For example, we have two different metrics for matching the text in a person’s resume against the text in a job description: one is a fuzzy match of words and one is an exact match. A person who scores high on the exact match is going to score well on the fuzzy match too, but a person with a good fuzzy match won’t necessarily score well on the exact match. So, instead of treating these two features as completely independent, we combine them in an equation that figures out how independent they are and scores them based on that, so the part they share isn’t counted twice.
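For the curious, here is a minimal sketch of the idea in Python. This is not the production Bright Score code: the function names, the sample numbers, and the restriction to just two features are all invented for illustration. It rotates an exact-match score and a fuzzy-match score onto their principal axes, which leaves two new features that no longer overlap.

```python
import math

def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def pca_two_features(exact, fuzzy):
    """Rotate two correlated features onto their principal axes.

    Returns (pc1, pc2, explained): the per-person scores along each
    axis, and the share of total variance carried by the first axis.
    """
    var_e = covariance(exact, exact)
    var_f = covariance(fuzzy, fuzzy)
    cov_ef = covariance(exact, fuzzy)

    # Angle of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov_ef, var_e - var_f)
    c, s = math.cos(theta), math.sin(theta)

    # Center each feature, then rotate by theta.
    mean_e = sum(exact) / len(exact)
    mean_f = sum(fuzzy) / len(fuzzy)
    pc1 = [c * (e - mean_e) + s * (f - mean_f) for e, f in zip(exact, fuzzy)]
    pc2 = [-s * (e - mean_e) + c * (f - mean_f) for e, f in zip(exact, fuzzy)]

    var1 = covariance(pc1, pc1)
    var2 = covariance(pc2, pc2)
    return pc1, pc2, var1 / (var1 + var2)

# Made-up match scores for five people (hypothetical data).
exact = [0.1, 0.2, 0.4, 0.6, 0.8]
fuzzy = [0.3, 0.5, 0.6, 0.8, 0.9]

pc1, pc2, explained = pca_two_features(exact, fuzzy)
print("covariance before:", covariance(exact, fuzzy))
print("covariance after: ", covariance(pc1, pc2))
print("share of variance on first component:", explained)
```

The first component captures what the two match scores have in common, and the second captures what is left over, so a model that uses them is not rewarding the same signal twice.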
With these changes, we should see a reduction in over-scoring and thus false positives. The speed should stay about the same. If you notice anything funky going down, alert your nearest DS member and we’ll check it out.