Daria Sorokina

Publications



Refereed publications
2012 Jeremy Kubica, Sameer Singh and Daria Sorokina
Parallel Large-Scale Feature Selection.
In Scaling Up Machine Learning: Parallel and Distributed Approaches, Cambridge University Press.

This material is copyright Cambridge University Press. It may be downloaded and printed for personal reference, but not otherwise copied, altered in any way or transmitted to others (unless explicitly stated otherwise) without the written permission of Cambridge University Press.

2009 Daria Sorokina, Rich Caruana, Mirek Riedewald, Wes Hochachka, Steve Kelling
Detecting and Interpreting Variable Interactions in Observational Ornithology Data.
In proceedings of the ICDM'09 International Workshop on Domain Driven Data Mining (DDDM'09).
2009 Daria Sorokina
Application of Additive Groves Ensemble with Multiple Counts Feature Evaluation to KDD Cup'09 Small Data Set.
In JMLR Workshop and Conference Proceedings vol. 7: proceedings of KDD Cup'09 competition.
2009 Sameer Singh, Jeremy Kubica, Scott Larsen, Daria Sorokina
Parallel Large Scale Feature Selection for Logistic Regression.
In proceedings of SIAM International Conference on Data Mining (SDM'09).
2008 Daria Sorokina, Rich Caruana, Mirek Riedewald, Daniel Fink.
Detecting Statistical Interactions with Additive Groves of Trees.
In proceedings of the 25th International Conference on Machine Learning (ICML'08).
2007 Daria Sorokina, Rich Caruana, Mirek Riedewald.
Additive Groves of Regression Trees.
In proceedings of the 18th European Conference on Machine Learning (ECML'07). (Best Student Paper award.)
2007 W. Hochachka, R. Caruana, A. Munson, M. Riedewald, D. Sorokina, D. Fink, S. Kelling.
Data-Mining Discovery of Pattern and Process in Ecological Systems.
Journal of Wildlife Management: 71(7), pp. 2427-2437.
2006 Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg.
Plagiarism Detection in arXiv.
In proceedings of the 6th IEEE International Conference on Data Mining (ICDM'06).
2006 R. Caruana, M. Elhawary, A. Munson, M. Riedewald, D. Sorokina, D. Fink, W. Hochachka, S. Kelling.
Mining Citizen Science Data to Predict Prevalence of Wild Bird Species.
In proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06).


Other materials - talks, tech reports, competition write-ups, etc.
2016 Daria Sorokina, Erick Cantu-Paz
Amazon Search: The Joy of Ranking Products.
In proceedings of SIGIR'16. Extended abstract of an industry track talk.
2010 Daria Sorokina
Application of Additive Groves to the Yahoo! Learning to Rank Challenge.
2009 Daria Sorokina, Alexander Sorokin
Additive Groves with very simple features for brain fibers classification.
2009 Lujie Chen, Artur Dubrawski and Daria Sorokina
Multivariate Analysis for Predicting Risk of Microbial Contamination of Food.
In proceedings of the International Society for Disease Surveillance conference (ISDS'09).
2008 Daria Sorokina
Modeling Additive Structure and Detecting Interactions with Groves of Trees.
PhD dissertation, Cornell University.
2006 Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg.
Plagiarism Detection in arXiv.
Technical Report TR2006-2046, Computing and Information Science, Cornell University, 2006.
2003 Daria Sorokina, Mikhail Petrovskiy.
Adaptation of the Fuzzy Decision Tree Algorithm for Multidimensional Datacubes.
Collected Articles on Software Systems and Tools, CMC MSU publishing, Moscow, Russia.
2003 Daria Erofeyeva.
Fuzzy Approach to Classification for Multidimensional Datacubes.
Diplom thesis, Moscow State University.