this post was submitted on 09 Nov 2023
21 points (83.9% liked)

Programming


I recently decided to get into ML using Python, and I'm wondering which ML library I should learn first as an ML beginner. I've been using Python for a few years now.

top 16 comments
[–] AlmightySnoo@lemmy.world 15 points 2 years ago* (last edited 2 years ago) (1 children)

I'd say since you're a beginner, it's much better to try to implement your regression functions and any necessary helper functions (train/test split etc...) yourself in the beginning. Learn the necessary linear algebra and quadratic programming and try to implement linear regression, logistic regression and SVMs using only numpy and cvxpy.
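As a rough illustration of that advice, here is a minimal sketch of linear regression plus a hand-written train/test split using only numpy. The helper names (`train_test_split`, `fit_linear_regression`, `predict`) and the synthetic data are made up for this example, and the least-squares fit via `np.linalg.lstsq` is just one of several ways to do it.

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

def fit_linear_regression(X, y):
    """Least-squares fit; returns (weights, intercept)."""
    X1 = np.column_stack([X, np.ones(len(X))])  # extra column of ones for the intercept
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[:-1], coef[-1]

def predict(X, weights, intercept):
    return X @ weights + intercept

# Synthetic data (an assumption for this sketch): y = 3*x0 - 2*x1 + 1 + noise
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y)
weights, intercept = fit_linear_regression(X_train, y_train)
test_mse = np.mean((predict(X_test, weights, intercept) - y_test) ** 2)
print(weights, intercept, test_mse)
```

The fitted weights should land close to the 3, -2 and 1 used to generate the data, and the error is measured only on rows the fit never saw.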

Once you get the hang of it, you can jump straight into sklearn and be confident that you roughly understand what those "black boxes" really do, which will also help you a lot with troubleshooting.

For neural networks and deep learning, pytorch is imposing itself as an industry standard right now. Look up "adjoint automatic differentiation" ("backpropagation" doesn't do it any justice as pytorch instead implements a very general dynamic AAD) and you'll understand the "magic" behind the gradients that pytorch gives you. Karpathy's YouTube tutorials are really good to get an intro to AAD/autodiff in the context of deep learning.
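To make the idea of reverse-mode (adjoint) automatic differentiation a bit more concrete, here is a toy scalar sketch in plain Python. It is not how pytorch is implemented; the `Value` class and its methods are hypothetical names for this example, and it only supports addition, multiplication and tanh.

```python
import math

class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # pushes this node's grad to its parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward_fn = backward_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def backward_fn():
            self.grad += (1.0 - t * t) * out.grad
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Order the graph so every node comes after its parents,
        # then apply the chain rule from the output back to the inputs.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()

# A single "neuron": y = tanh(x * w + b)
x, w, b = Value(0.5), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()
print(y.data, x.grad, w.grad, b.grad)
```

One backward pass yields the derivative of the output with respect to every input at once, which is the property that makes this approach practical for models with many parameters.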

[–] Asudox@lemmy.world 5 points 2 years ago (1 children)

So I should learn sklearn first before pytorch to understand the basics?

[–] AlmightySnoo@lemmy.world 5 points 2 years ago* (last edited 2 years ago) (1 children)

Linear and logistic regression are much easier (and less error prone) to implement from scratch than neural network training with backpropagation.
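For a sense of scale, a from-scratch logistic regression fits in a couple of dozen lines of numpy. This is only a sketch under assumptions: plain batch gradient descent on the mean log-loss, made-up function names, and synthetic two-blob data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, n_steps=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, intercept)."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    intercept = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ weights + intercept)
        error = p - y  # gradient of the log-loss with respect to the logits
        weights -= learning_rate * (X.T @ error) / n_samples
        intercept -= learning_rate * error.mean()
    return weights, intercept

def predict_labels(X, weights, intercept):
    return (sigmoid(X @ weights + intercept) >= 0.5).astype(int)

# Synthetic data (an assumption for this sketch): two Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

weights, intercept = fit_logistic_regression(X, y)
accuracy = np.mean(predict_labels(X, weights, intercept) == y)
print(weights, intercept, accuracy)
```

The whole gradient is the single line `error = p - y`, which is a large part of why this is less error prone than hand-written backpropagation through several layers.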

That way you can still follow the progression I suggested: implement those regressions by hand using numpy -> compare against (and appreciate) sklearn -> implement SVMs by hand using cvxpy -> appreciate sklearn again.
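The "compare against sklearn" step might look something like the sketch below: fit the same synthetic data by hand with numpy and with `sklearn.linear_model.LinearRegression`, then check that the coefficients agree. The data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for this sketch)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -0.5, 2.0]) + 0.7 + rng.normal(scale=0.05, size=100)

# By hand: least squares on X with an extra column of ones for the intercept
X1 = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
hand_weights, hand_intercept = coef[:-1], coef[-1]

# With sklearn: the same model behind the fit/coef_/intercept_ interface
model = LinearRegression().fit(X, y)

print("weights match:  ", np.allclose(hand_weights, model.coef_))
print("intercept match:", np.isclose(hand_intercept, model.intercept_))
```

If the two disagree, the bug is almost certainly in the hand-written version, which makes sklearn a handy reference implementation while learning.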

If you get the hang of "classical" ML, then deep learning becomes easy as it's still machine learning, just with more complicated models and no closed-form solutions.

[–] Asudox@lemmy.world 3 points 2 years ago

Aight thanks.

[–] rutrum@lm.paradisus.day 6 points 2 years ago

For more "traditional" or "statistical" modeling (not NN) 100% start with sklearn. It has a plethora of algorithms, and their docs read like a book. You can learn a whole bunch of new methods and techniques from there too. In tandem, you should familiarize yourself with matplotlib, which is the plotting library it uses under the hood (and is by far the most popular plotting library).
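As a small taste of what starting with sklearn looks like, here is a minimal sketch that trains a classifier on the iris dataset bundled with the library. The choice of dataset, scaler and model is arbitrary and only meant to show the usual split/fit/score workflow.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small dataset that ships with sklearn (no download needed)
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit a logistic regression classifier
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Swapping `LogisticRegression` for another estimator usually requires changing only that one line, since sklearn estimators share the same `fit`/`predict`/`score` interface.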

For deep learning, I'd say PyTorch? Tensorflow used to be standard but it's fallen out of favor compared to PyTorch. I don't use either so I'm not sure.

[–] 4shtonButcher@discuss.tchncs.de 5 points 2 years ago (1 children)

Maybe find some code to look at on the HuggingFace hub page? HuggingFace libraries or PyTorch are likely to give you really good learning opportunities and examples. Just keep an eye out for timestamps of articles or version numbers. And of course use venv/conda/.. to not mess up your version when trying out different things 😉

[–] Asudox@lemmy.world 2 points 2 years ago* (last edited 2 years ago) (2 children)

In your opinion, is PyTorch easier than something like TF? What do you think about Keras?

[–] 4shtonButcher@discuss.tchncs.de 2 points 2 years ago

I’m not personally coding with them, just often supporting people and their projects that do. Keras is also popular but I’ve at least personally seen slightly shoddier implementations with it. That could be selection bias though.

[–] jacksilver@lemmy.world 2 points 2 years ago

I personally think Keras has a nice and intuitive high-level API for getting into neural networks, but Pytorch is definitely the most prominent library. If you're going to start somewhere, you're not going to regret learning Pytorch.

That being said, as others have mentioned, if you want to be a good data scientist or ML practitioner, learning the basics is never a bad idea. Sklearn is still the best library for a lot of ML tasks and is good to be familiar with.

There are a couple of good books out there that start off with the basics using numpy, pandas, Sklearn and build up to neural networks/deep learning. I've used this one in the past: https://www.amazon.com/Machine-Learning-PyTorch-Scikit-Learn-learning/dp/1801819319.

[–] hulemy@ani.social 4 points 2 years ago

Sklearn has those built-in graph and chart displays.

[–] Scrath@feddit.de 4 points 2 years ago (1 children)

It's been a while since I last looked into those.

If you aren't looking for neural networks I found sklearn to be quite capable and easy to understand.

I also tried tensorflow and pytorch a couple times (not enough to get really proficient in them) and I think I found pytorch the hardest to wrap my head around. It's been quite a while though so maybe it's better to listen to others with more experience in that regard.

[–] Asudox@lemmy.world 1 points 2 years ago
[–] Artyom@lemm.ee 2 points 2 years ago

Sklearn for most of the data handling, pytorch for the model. They're designed to be useable together.

[–] chepox@sopuli.xyz 2 points 2 years ago (2 children)
[–] silas@programming.dev 16 points 2 years ago

Microwaveable Legos

[–] SteveTech@programming.dev 5 points 2 years ago

Machine Learning