PyML: machine learning in Python (sourceforge.net)
63 points by rogercosseboom on Feb 22, 2009 | 17 comments


I am currently playing around with it. It has a decent support vector machine implementation. However, I have some problems with it. It is nowhere near as huge as Weka (http://www.cs.waikato.ac.nz/ml/weka/). It is in Python (big plus) and seems easy to hack for online classification. I like https://mlpy.fbk.eu/ a bit better, as it also has a decent integrated Nearest Neighbor (OK, OK, this one is not hard to implement on your own) and FDA + DWT (awesome).
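
To back up the "not hard to implement" remark, here is a minimal 1-nearest-neighbor sketch in plain Python. It assumes Euclidean distance and points given as tuples of floats; the function name and the toy data are made up for illustration and have nothing to do with mlpy's or PyML's actual API.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    if not train_points or len(train_points) != len(train_labels):
        raise ValueError("need equally sized, non-empty points and labels")
    best_label, best_dist = None, math.inf
    for point, label in zip(train_points, train_labels):
        dist = math.dist(point, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical toy data: two small clusters labeled "a" and "b".
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]

print(nearest_neighbor_predict(points, labels, (0.2, 0.1)))
print(nearest_neighbor_predict(points, labels, (0.8, 0.9)))
```

This is a brute-force scan, so each query costs one distance computation per training point. That is fine for playing around, but a library version would probably use a smarter index for large datasets.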


Hey, what do FDA and DWT stand for? (Sorry, that link doesn't work for me.) Thanks!


Google, being a good friend of mine, didn't mind me asking. His answer: Fisher Discriminant Analysis (FDA), Discrete Wavelet Transform (DWT). I'll introduce you: http://google.com


I did search for DWT. I don't understand how wavelets would be used in this context, so I was hoping for some verification.


Weka can be used with JRuby, or Jython (I assume).


PySVM might have been a better name, since it looks like it only does SVM and kernel methods. I haven't used this package, but I'll recommend libsvm and liblinear because they're fast and have wrappers for just about every language you want, including Java, Ruby, Matlab, and Python http://www.csie.ntu.edu.tw/~cjlin/libsvm/#python

If you're doing large scale linear classification, liblinear is especially awesome and fast. http://www.csie.ntu.edu.tw/~cjlin/liblinear/

EDIT: From the PyML documentation:

"By default the libsvm solver is used in training. To use the PyML SMO optimizer either set the optimizer attribute to 'mysmo' or instantiate an svm instance as svm.SVM(optimizer = 'mysmo'). Note that for a non-vector dataset, the default libsvm optimizer cannot be used and the PyML native SMO implementation is automatically used instead (it is slower than libsvm so is not the default)"


Yes, great lib (I totally forgot about the Python bindings). I was using it under Matlab and it works like a charm.


One serious problem with HN is that the amount of good stuff on here greatly exceeds my capacity for absorbing it all... Roughly one week goes by and I have at least another month's worth of reading added to my list.


http://www.ailab.si/orange/

This is the most thorough Python machine learning package I know of. Includes SVM.

Lots of tutorials, too, and a visual development environment (if that's your style).


Wow, sounds impressive, but that link isn't working for me unfortunately! Do you know of a mirror?


Blast, it was fine just two hours ago.

http://74.125.47.132/search?q=cache:iqMjX8uGgroJ:www.ailab.s...

Google cache, to whet your appetite...


Link working now, FYI.


But the Orange library is not actively developed/maintained, right?


Well, it is academic software. I work at ailab.si and a lot of my work concerns Orange, especially its bioinformatics addon. Its development is slow because we also need to do some other things, like research. New features are usually added when we need them.

It is actively maintained/developed, but by only a few developers, so its progress can seem slow at times.


If only there were some good non-GPL machine learning libraries for Python...


There is a start at it here:

http://www.scipy.org/scipy/scikits/wiki/MachineLearning

Still some active development, but needs more people:

http://projects.scipy.org/scipy/scikits/browser/trunk/learn/...


Has anyone used this? Care to share your experience?



