Additive Groves code: TreeExtra package

TreeExtra is a set of tools implementing the following algorithms:

Additive Groves is an ensemble of regression trees developed by Daria Sorokina, Rich Caruana and Mirek Riedewald.

"Multiple counts" is a feature evaluation technique developed by Art Munson and all of the above.

All code is written by Daria Sorokina unless stated otherwise. The code is available under the BSD license and is free to use for any purpose. (It also makes use of external libraries available under the LGPL license.)

Contact: Daria Sorokina ( fi...@gmail.com )

Please e-mail me any comments, suggestions, bug reports or feature requests. I am interested in how my algorithm is doing: if you have successfully (or unsuccessfully) applied Additive Groves to your data, I'd be happy to hear about your experience with it.


Updates

Manuals

Downloads

TreeExtra 2.3

Additional code

Earlier TreeExtra versions

Research papers and presentations

Daria Sorokina.
Application of Additive Groves to the Yahoo! Learning to Rank Challenge.

Daria Sorokina.
Modeling Additive Structure and Detecting Interactions with Additive Groves of Regression Trees
CMU Machine Learning Lunch, March 2010
Video (Scroll down to the March 1, 2010 talk. The sound is bad only for the first few minutes.)
Slides (.ppt)

Daria Sorokina, Rich Caruana, Mirek Riedewald, Wes Hochachka, Steve Kelling.
Detecting and Interpreting Variable Interactions in Observational Ornithology Data.
In proceedings of the ICDM'09 Workshop on Domain Driven Data Mining (DDDM'09).

Daria Sorokina.
Application of Additive Groves Ensemble with Multiple Counts Feature Evaluation to KDD Cup'09 Small Data Set.
In proceedings of the KDD Cup 2009 workshop.

Daria Sorokina.
Modeling Additive Structure and Detecting Interactions with Groves of Trees.
PhD dissertation, Cornell University, 2008.

Daria Sorokina, Rich Caruana, Mirek Riedewald, Daniel Fink.
Detecting Statistical Interactions with Additive Groves of Trees.
In proceedings of the 25th International Conference on Machine Learning (ICML'08).
Video of ICML presentation
Slides (.ppt)

Daria Sorokina, Rich Caruana, Mirek Riedewald.
Additive Groves of Regression Trees.
In proceedings of the 18th European Conference on Machine Learning (ECML'07) (Best Student Paper award.)
Video of ECML presentation
Slides (.ppt)

R. Caruana, M. Elhawary, A. Munson, M. Riedewald, D. Sorokina, D. Fink, W. Hochachka, S. Kelling.
Mining Citizen Science Data to Predict Prevalence of Wild Bird Species. (The feature evaluation methods are described here.)
In proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06).