The road to LLM ~What is machine learning anyway? With an explanation of this project~[Day 1]

  1. What is the advent calendar like?
  2. Intended readers
  3. What is Machine Learning?
  4. What’s great about machine learning?
  5. About pre-trained models
  6. Conclusion

🚀 Hello everyone! This is Kbilel 

Over the past few months, I’ve dived deep into the fascinating world of generative AI. 🤖 Starting today, as part of my Advent Calendar for Q4 2024 project, I’m excited to share everything I’ve learned. From the basics to the brain-benders, there’s something here for everyone! 

🎉 This is my debut in planning an Advent Calendar, and I’m pumped to bring you a series that’s both enlightening and fun. 

Stay tuned, dive in, and let’s explore the future of AI together! 🌟 

What is the advent calendar like?


Have you ever heard the term LLM? It stands for Large Language Model, and I will use the abbreviation LLM from here on. Some of you may have come across it when ChatGPT was covered in the news.

In this advent calendar, I will pick out and write about the important points on the road to LLMs. I will explain GPT, which is one kind of LLM, and also the Transformer, which you need in order to understand GPT. Starting today, I will first cover the basic machine learning knowledge needed to explain the Transformer.

In other words, my goal is for readers to end up able to do the following:

  • Understand the natural language processing models that led up to LLMs, so that you can say you « understand » ChatGPT (including what is inside it).
  • Quickly understand similar services when something like ChatGPT is released.

Intended readers


This series is not aimed at people who already know what an LLM is as soon as they hear the word, or who have read the papers on the models inside. (Of course, I would be very grateful if such readers gave me feedback after reading!)

The intended readers are as follows:

  • Machine learning beginners
    • People who are wondering, « What is machine learning anyway? »
  • People who know about ChatGPT but don’t understand what’s so great about it or what goes on behind the scenes.
    • People who have difficulty answering « What is ChatGPT? » while also briefly explaining the underlying mechanisms.
  • People who are wondering « What is GPT? »
  • People who want a rough mental picture of how it all works

That covers what this advent calendar is about and who it is for.

Next, here is a brief overview of what machine learning actually does.

What is Machine Learning?

If you want to know what machine learning is, the following AWS page is helpful.

https://aws.amazon.com/jp/what-is/machine-learning/

The following passage is quoted from the « What is machine learning? » section of the above page.

Machine learning is the science of developing algorithms and statistical models that computer systems use to perform tasks without explicit instructions, relying instead on patterns and inference.

Computer systems use machine learning algorithms to process large amounts of historical data and identify patterns in the data, which allow them to more accurately predict outcomes from a given set of input data.

For example, data scientists can train a medical application to diagnose cancer from X-ray images by storing millions of scan images and their corresponding diagnostic information.

Let’s build an intuitive picture of the above.

The first step is to process large amounts of data. In the field of machine learning, this is called learning.

The next step is to make predictions based on what was learned from that data. In the field of machine learning, this is called inference.
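The two steps can be sketched in a few lines of Python. This is only a minimal illustration under assumptions of my own: the data is made up, the "model" is just a straight line fitted by least squares, and the function names `learn` and `infer` are hypothetical names chosen to match the terms above.

```python
# Hypothetical training data: inputs xs and the outcomes ys observed for them.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]


def learn(xs, ys):
    """Learning: find the slope and intercept of the line that best fits the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def infer(model, x):
    """Inference: use the learned parameters to predict the outcome for a new input."""
    slope, intercept = model
    return slope * x + intercept


model = learn(xs, ys)            # step 1: learning from past data
prediction = infer(model, 6.0)   # step 2: inference on an input not in the data
print(prediction)
```

Real models have far more parameters than a slope and an intercept, but the split between a learning step and an inference step is the same.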

That is a simple picture of the two parts of machine learning called learning and inference.

Now let’s think about why we need machine learning!

What’s great about machine learning?


Let’s say you have the following problem:

You work as a manager at a retail store. It's the height of summer, and ice cream sales are going strong.

After checking the number of units sold in past ledger records, it seems that temperature and weather are related to sales.

When the weather forecast announces the temperature and weather for the next week, you'd like a computer to predict the appropriate number of units to purchase instead of you, because it's a pain to think about it yourself every time you need to purchase more.

What should you do?

If you program this without machine learning, you could, for example, « return a fixed value according to the temperature range and whether the weather is sunny, cloudy, or rainy. » However, as more conditions come into play, such as « a certain commercial became popular and sales increased significantly » or « the number of items sold differs by customer age, so we want to take that into account, » the program becomes increasingly difficult to write, and just as difficult to keep maintaining.
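To make the hand-written approach concrete, here is a rough sketch of such a rule-based program. The function name `units_to_purchase`, the temperature thresholds, and every quantity are hypothetical values invented for illustration, not figures from a real store.

```python
def units_to_purchase(temperature_c, weather):
    """Return a fixed purchase quantity chosen by hand for each condition."""
    if weather == "sunny":
        if temperature_c >= 30:
            return 200
        if temperature_c >= 25:
            return 150
        return 100
    if weather == "cloudy":
        if temperature_c >= 30:
            return 140
        if temperature_c >= 25:
            return 100
        return 70
    if weather == "rainy":
        if temperature_c >= 30:
            return 90
        if temperature_c >= 25:
            return 60
        return 40
    raise ValueError(f"unknown weather: {weather}")


print(units_to_purchase(32, "sunny"))
```

Two inputs already need nine hand-picked numbers. Adding a condition such as commercial popularity or customer age multiplies the branches, and every number has to be revisited by hand whenever sales patterns change.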

Provided that these circumstances are available in the form of data, machine learning can learn from that data and give a prediction (an inference result) such as « Shouldn’t we purchase this much? » By the way, if you are interested in this kind of demand prediction, you may find interesting information by searching for « opportunity loss machine learning ».

As mentioned above, machine learning can produce outputs suited to various situations. Other examples include the following:
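As a contrast with hand-written rules, the sketch below predicts a purchase quantity directly from past records. Everything in it is an assumption for illustration: the ledger rows are invented, and the method (averaging the most similar past days, a simple nearest-neighbour approach) is chosen only because it is short, not because a real demand forecast would necessarily use it.

```python
# Hypothetical ledger records: (temperature in Celsius, weather, units sold).
ledger = [
    (33, "sunny", 210),
    (31, "sunny", 190),
    (30, "cloudy", 140),
    (27, "sunny", 150),
    (26, "cloudy", 100),
    (25, "rainy", 60),
    (22, "rainy", 40),
    (21, "cloudy", 70),
]

# Assumption: a day with different weather counts as being 10 degrees "further away".
WEATHER_MISMATCH_PENALTY = 10.0


def distance(record, temperature_c, weather):
    """Measure how dissimilar a past record is to the forecast day."""
    rec_temp, rec_weather, _ = record
    penalty = 0.0 if rec_weather == weather else WEATHER_MISMATCH_PENALTY
    return abs(rec_temp - temperature_c) + penalty


def predict_units(ledger, temperature_c, weather, k=2):
    """Average the units sold on the k most similar past days."""
    nearest = sorted(ledger, key=lambda r: distance(r, temperature_c, weather))[:k]
    return sum(units for _, _, units in nearest) / len(nearest)


print(predict_units(ledger, 32, "sunny"))
```

The point of the design is that no quantity is written into the code. When new ledger rows are added, the predictions change without anyone editing the program.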

  • Determine if someone is sick from medical scan data
  • Determine whether a certain Chinese sentence is a positive or negative expression
  • Classify multiple data items based on their characteristics
  • Determine what characters a person has written
  • Generate 3D models from images

The above are just a few examples, but once you have secured the large amount of data that machine learning requires and a model can learn from it, machine learning will undoubtedly be a powerful ally.

However, collecting the data and the learning itself both take a lot of effort. Try searching for something like « machine learning GPU price » on your PC or smartphone and you will find that it costs tens of thousands of yen. The amount and variety of data handled in machine learning has been growing in recent years, and so has the number of settings in machine learning models, such as the parameters used for learning. With that in mind, I think the cost to consider is not tens of thousands of yen but millions of yen, and in some cases more than tens of millions of yen once the other things needed for operation are included. It is not a price that an individual can easily afford (at least, I cannot).

Here’s the good news: machine learning makes it possible to make inferences using models that someone else has trained.

About pre-trained models


Because training requires a lot of money and effort, people often use a model that has already been trained by someone else, known as a pre-trained model. With a pre-trained model, lightweight models can run even on a personal laptop.
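The idea can be illustrated with a toy sketch: the parameters arrive already trained, and you only run inference. The JSON text, its field names, and the functions `load_pretrained` and `infer` are all hypothetical stand-ins. Real pre-trained models have vastly more parameters and are normally loaded through a dedicated library rather than by hand.

```python
import json

# Hypothetical contents of a file published by someone who already trained the model.
PRETRAINED_JSON = '{"slope": 2.0, "intercept": 0.0}'


def load_pretrained(text):
    """Read the already-trained parameters; no training data is needed."""
    params = json.loads(text)
    return params["slope"], params["intercept"]


def infer(model, x):
    """Inference with the loaded parameters."""
    slope, intercept = model
    return slope * x + intercept


model = load_pretrained(PRETRAINED_JSON)  # no learning step happens here
print(infer(model, 6.0))
```

All of the expensive work happened before the parameters were published, which is why inference alone can be cheap enough for a laptop.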

The following website publishes pre-trained models.

https://huggingface.co/

You can create your own model by retraining a machine learning model published on the above site with a small amount of data that you have. This is called fine-tuning.
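A toy sketch of fine-tuning follows, assuming a model with just two parameters. The starting values, the small dataset, the learning rate, and the step count are all hypothetical, and plain gradient descent is used here only as one simple way to retrain.

```python
# Parameters assumed to come from a pre-trained model (hypothetical values).
pretrained = {"slope": 2.0, "intercept": 0.0}

# A small dataset of your own that differs slightly from what the model learned.
own_xs = [1.0, 2.0, 3.0]
own_ys = [3.5, 6.0, 8.5]


def mean_squared_error(params, xs, ys):
    """Average squared gap between the model's predictions and the data."""
    slope, intercept = params["slope"], params["intercept"]
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fine_tune(params, xs, ys, learning_rate=0.05, steps=200):
    """Continue training from the pre-trained parameters with gradient descent."""
    slope, intercept = params["slope"], params["intercept"]
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return {"slope": slope, "intercept": intercept}


before = mean_squared_error(pretrained, own_xs, own_ys)
tuned = fine_tune(pretrained, own_xs, own_ys)
after = mean_squared_error(tuned, own_xs, own_ys)
print(before, after)
```

Training starts from the published parameters instead of from scratch, so a small amount of data is enough to adapt the model: the error on your own data should be lower after fine-tuning than before.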

Conclusion

Today, I used images to explain « What is machine learning, anyway? What does it do? » We will gradually get more serious and eventually delve into the machine learning models known as LLMs, asking « What do they actually do? »

Over the coming days, we will be talking about loss functions, activation functions, and evaluation metrics, which are the basic parts of machine learning.

That’s all for today. Please come back for the next entries if you’d like!

