Daria Sorokina

Contact info

e-mail address is hiding here

cell phone: +1-914-980-2028

My main area of research is machine learning algorithms, and I am particularly interested in applications involving large real-world data sets. For the past few years I've been working on projects related to search.

I am currently at LinkedIn, on the Product Data Science team. Here is my LinkedIn profile.

I graduated from Cornell University in 2008. My PhD thesis concerned modeling additive structure and detecting statistical interactions; we developed a new method for these problems, Additive Groves of regression trees. This algorithm was used in winning entries of several major data mining competitions. It is released and supported as part of the TreeExtra package; see the link below.

My other current and past research topics include search relevance, ranking, active learning, feature selection/evaluation, soft (fuzzy) decision trees, spectral clustering and plagiarism detection.

TreeExtra package: code for Additive Groves and more.
My official resume, last updated 14 Dec 2011
Publications, talks, competition write-ups. Most texts are available.
Last modified 21 Apr 2012