profiq Video: Evaluating LLMs with MLflow by Miloš Švaňa
Anke Corbin
21.8.2024
Are you developing an application and looking to integrate large language model (LLM) features? With multiple options like GPT, Gemini, Claude, and open-source models from Hugging Face, choosing the right solution can be overwhelming. Each option has its own strengths, from GPT’s versatile text generation and Claude’s detailed descriptions to the flexibility of open-source models.
Integrating LLM features can significantly enhance your application by providing capabilities such as natural language understanding, text generation, and intelligent automation. To make an informed decision, it’s essential to evaluate your application’s specific needs, compare model performances, and track data systematically. Tools like MLflow can help you monitor and compare the effectiveness of different models, ensuring you select the best fit for your project.
To dive deeper into evaluating LLMs and effectively tracking experiments, check out this video by profiq’s Miloš Švaňa. Miloš is an ML engineer and researcher at profiq who specializes in large language models, deep learning, classical ML, and statistics. He demonstrates practical techniques for using similarity metrics and MLflow, so you can make a well-founded choice when integrating LLM features into your application.
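To give a feel for what a similarity metric does in this setting, here is a minimal sketch that scores how close a model's extracted title is to the expected one and averages the scores over a dataset. It uses Python's standard-library `difflib.SequenceMatcher` as a stand-in; the metric used in the video may be different, and the function names `title_similarity` and `average_similarity` are illustrative assumptions, not taken from the video.

```python
from difflib import SequenceMatcher
from statistics import mean


def title_similarity(predicted: str, expected: str) -> float:
    """Return a score in [0, 1]; 1.0 means the titles match (ignoring case and outer whitespace)."""
    a = predicted.strip().lower()
    b = expected.strip().lower()
    # Assumption: character-level ratio from difflib, used here only as a simple example metric.
    return SequenceMatcher(None, a, b).ratio()


def average_similarity(predictions: list[str], references: list[str]) -> float:
    """Average the per-page similarity over the whole dataset."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    return mean(title_similarity(p, r) for p, r in zip(predictions, references))
```

Averaging a per-example score like this gives one number per model, which makes it possible to compare models objectively rather than by reading individual outputs.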
Here are some useful timestamps in the video:
0:20 Intro, options to choose from when integrating LLM functionality
0:40 Using an LLM to analyze an HTML page and extract a title
1:17 Prompt template
4:37 Objectively comparing results from different models
9:19 Calculating average similarity across all websites in the dataset
10:15 Introducing MLflow
12:00 Starting a new run
15:12 Results of runs in the MLflow UI
15:33 Results by model
18:14 Different ways to explore and sort results
19:05 Summary and Conclusion
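The workflow outlined in these timestamps (score each model, then record one MLflow run per model) can be sketched roughly as follows. This is not the code from the video: the experiment name, the metric names, and the helpers `summarize_run` and `log_run` are hypothetical, and the similarity scores are assumed to have been computed beforehand. Only standard MLflow tracking calls are used (`set_experiment`, `start_run`, `log_params`, `log_metrics`).

```python
from statistics import mean

try:
    import mlflow  # third-party package: pip install mlflow
except ImportError:
    mlflow = None  # lets the summary helper run even without MLflow installed


def summarize_run(model_name: str, prompt_template: str, scores: list[float]) -> dict:
    """Collect the parameters and metrics that describe one evaluation run."""
    return {
        "params": {"model": model_name, "prompt_template": prompt_template},
        "metrics": {"avg_similarity": mean(scores), "min_similarity": min(scores)},
    }


def log_run(summary: dict, experiment: str = "llm-title-extraction") -> None:
    """Record one run in MLflow; the experiment name is a hypothetical example."""
    if mlflow is None:
        raise RuntimeError("mlflow is not installed")
    mlflow.set_experiment(experiment)
    # One run per model, so runs can be compared side by side in the UI.
    with mlflow.start_run(run_name=summary["params"]["model"]):
        mlflow.log_params(summary["params"])
        mlflow.log_metrics(summary["metrics"])
```

Calling `log_run(summarize_run("model-a", "Extract the title: {html}", scores))` once per candidate model and then running `mlflow ui` should list the runs with their parameters and metrics, where they can be sorted and compared.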
If you’d like to talk about how to build out your innovative software projects and how to incorporate AI, please contact us.