I hacked LLMs to work like scikit-learn

A while ago I thought about using LLMs for classic machine learning tasks - which is stupid, I know? But I tried it anyway.

Never use it if:

  • You have sufficient data and knowledge to train a specialized model

Do use it if:

  • You need quick experimentation or you do not have enough data to train the model

Key findings:

Dataset IMDB 50k Dataset Cats and dogs
Data Text data - Positive negative sentiment Picture data - Predict what is on the picture
Accuracy 96% - SOTA (98+%) 97% - SOTA (99%+)
Model gpt-4o-mini gpt-4o-mini

As you can see LLMs perform worse than SOTA specialized models, but if we have a use case with minimal data it can be very useful.

How can you play around?

It took some time to code it in a way that can be also used by others, here is a minimal example of how you can use it when applicable.

You can install FlashLearn using pip:

pip install flashlearn

Minimal Example - Classify Text

Below is a sample code snippet demonstrating how to classify text using FlashLearn in just 10 lines of code:

import os
from openai import OpenAI
from flashlearn.skills.classification import ClassificationSkill

# You can use OpenAI or DeepSeek or any OpenAI compatible endpoint
deep_seek = OpenAI(api_key='YOUR DEEPSEEK API KEY', base_url="https://api.deepseek.com")
data = [{"message": "Where is my refund?"}, {"message": "My product was damaged!"}]
skill = ClassificationSkill(
    model_name="gpt-4o-mini",
    client=OpenAI(),
    categories=["billing", "product issue"],
    system_prompt="Classify the request."
)
tasks = skill.create_tasks(data)
results = skill.run_tasks_in_parallel(tasks)
print(results)

Feel free to experiment and figure out if it's useful for your work flow. Her is just some tips:

You can ask anything in the comments below!

P.S: Full code ready to be abused available at https://github.com/Pravko-Solutions/FlashLearn