R Machine Learning Essentials

By: Michele Usuelli

A data-driven approach in business decisions


Expertise and information play important roles in business decisions. This section shows how data-driven technologies have changed the way businesses approach challenges and have improved the solutions they reach.

Business decisions come from knowledge and expertise

The general approach to business problems hasn't changed over the years: it combines knowledge and information. Before digital technologies, knowledge came from expertise gained through previous experience and from other people. Information came from analyzing the current situation and comparing it with past events.

A simple example is that of a fruit monger who wants to set the prices of their goods. The price of a product should maximize the profit, which depends on the sales volume and on the price itself. The fruit monger started out working with their father, who passed on all his knowledge, so they already know the usual price of each fruit. In addition, at the end of each day, they can observe how much of each fruit has been sold. Based on that, they can raise the price of fruits that sold very well and lower the price of fruits that didn't sell. In this way, the fruit monger combines domain knowledge and information to solve their problem, as described in the following figure:

Even a simple challenge requires a combination of knowledge and data.
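
The fruit monger's end-of-day rule can be sketched in a few lines of Python. This is only an illustration: the function name `adjust_prices`, the sales thresholds, the 5% step, and the sample figures are all assumptions introduced here, not values from the text.

```python
# Illustrative sketch only: thresholds, step size, and data are assumed.

def adjust_prices(prices, units_sold, high=40, low=10, step=0.05):
    """Raise the price of fruits that sold well and lower the price of
    fruits that barely sold; leave the others unchanged."""
    new_prices = {}
    for fruit, price in prices.items():
        sold = units_sold.get(fruit, 0)
        if sold >= high:        # sold very well: raise the price
            price *= 1 + step
        elif sold <= low:       # hardly sold: lower the price
            price *= 1 - step
        new_prices[fruit] = round(price, 2)
    return new_prices

# Domain knowledge: the starting prices. Information: today's sales.
prices = {"apples": 2.00, "pears": 3.00, "figs": 4.00}
units_sold = {"apples": 50, "pears": 25, "figs": 4}

new_prices = adjust_prices(prices, units_sold)
print(new_prices)
```

The starting prices play the role of inherited knowledge, while the daily sales counts are the information used to correct them.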

The digital era provides more data and expertise

Although the general approach to business problems hasn't changed, digital technologies provide us with powerful new tools.

The Internet allows people to connect with each other and share their expertise, so that everyone has access to a huge body of information. Before the Internet, knowledge came from trusted people and books. Now, the spread of information makes it possible to find books and articles written by people from all over the world. In addition, websites and forums allow their users to connect with each other in order to share expertise and find quick answers.

Digital technologies keep track of different activities and produce a lot of related data. By data, we mean sets of information, quantitative or qualitative, that machines can process. Therefore, when facing a business problem, we can use lots of data from different sources. Some information might not be very relevant, but even after removing it, we often have a huge amount of data. As a result, there is considerable potential to improve the results.

The changes derived from digital technologies involve the process of acquiring expertise and the nature of data. Therefore, the approach to problem solving presents new challenges.

A simple example of a business problem is that of a car dealer who sells used cars and wants to set the most appropriate prices. The car dealer should determine the prices based on the car model, age, and other features. This example is meant to illustrate a possible situation and is not necessarily related to a real problem.

The car dealer needs to identify the best price for each car in order to maximize the profit. As with the fruit monger, the price cuts both ways. If the price of a car is too high, the car dealer won't sell it quickly, so there will be an extra storage cost and the car will lose value while it waits, which reduces the profit. On the other hand, if the price is too low, the company will sell the car immediately and the storage cost will be lower, but the company won't make the best possible profit. To sell cars and maximize profit, the company wants to figure out the optimal prices.
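
The trade-off can be made concrete with a toy model, sketched here in Python. Everything in it is an assumption made for illustration: the functions `expected_days_to_sell` and `expected_profit`, the way selling time grows with the price premium, and all the cost figures are hypothetical, not real market data.

```python
# Toy model: every formula and number below is an illustrative assumption.

def expected_days_to_sell(price, market_value):
    """Assume a car at or below market value sells in 5 days, and that the
    waiting time grows with the square of the percentage premium."""
    premium_pct = max(0.0, (price - market_value) / market_value * 100)
    return 5 + 0.5 * premium_pct ** 2

def expected_profit(price, market_value, purchase_cost,
                    storage_cost_per_day=15.0, value_loss_per_day=10.0):
    """Profit = sale price - purchase cost - costs of holding the car."""
    days = expected_days_to_sell(price, market_value)
    holding_cost = days * (storage_cost_per_day + value_loss_per_day)
    return price - purchase_cost - holding_cost

# Try a grid of candidate prices for one car and keep the most profitable.
candidate_prices = range(9000, 12001, 100)
profits = {p: expected_profit(p, market_value=10000, purchase_cost=8000)
           for p in candidate_prices}
best_price = max(profits, key=profits.get)

print(best_price, profits[best_price])
```

Under these assumptions, the lowest price sells fast but leaves money on the table, the highest price is eaten up by holding costs, and the most profitable price lies in between.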

Let's take a look at the expertise and information that help in finding the solution. The company can use:

  • The knowledge of agents who have already sold different cars

  • Information from the Internet

  • The data about previous sales

The agents can draw on their past experience, so their knowledge helps in identifying the best prices. However, experience alone is not enough to set prices when the market changes quickly.

The Internet gives us a lot of information since there are many online shopping websites displaying the prices of used cars. Online shopping is different from the physical market, but an expert agent can look at the websites and compare the prices. In this way, the agent can combine their expertise with the online information to identify suitable prices.

This approach leads to good results, but is still not optimal. Looking at different websites is time-consuming, especially if there are many categories of cars, so it is hard or even impossible to check the prices on a daily basis. Another issue is that there might be many websites, making it impossible for a single person to process all the information. By automating our web research and using data more systematically, we can acquire information much faster.

The data sources are the company's own sales records and the online market, and a good solution for car pricing should take both into account. The company sales data shows how customers reacted to its prices in the past. For instance, we know how long it took to sell each car. If it took too long, the price might have been too high. This criterion is objective, and an expert agent can use it to identify which current prices are wrong.
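
As a minimal sketch of this criterion, the Python snippet below flags past sales that took much longer than usual. The sample records, the helper `flag_slow_sales`, and the "more than twice the median" threshold are assumptions chosen for illustration; a real analysis would at least compare cars of the same model and age.

```python
from statistics import median

# Hypothetical past sales: (car_id, price, days_to_sell)
past_sales = [
    ("A", 9500, 12),
    ("B", 10200, 15),
    ("C", 11800, 60),
    ("D", 9900, 10),
    ("E", 10900, 41),
]

def flag_slow_sales(sales, factor=2.0):
    """Return the ids of cars that took more than `factor` times the
    median selling time, a hint that they may have been overpriced."""
    typical_days = median(days for _, _, days in sales)
    return [car_id for car_id, _, days in sales
            if days > factor * typical_days]

slow = flag_slow_sales(past_sales)
print(slow)
```

The flagged cars are only candidates: an agent still needs domain knowledge to decide whether the price or something else explains the delay.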

The data derived from online shopping websites shows the car prices, and we can use tools that store the price and sales history. Although this information is less relevant to the problem, it can be processed similarly to the company sales data, thereby improving the accuracy of the result, as described in the following figure:

This example shows the potential of having more information and expertise. The challenge here is to use the information in the most appropriate way to improve the solution. As a general rule, the more information we use, the more accurate the results can potentially be. Even in the worst case, where most of the information is irrelevant, we can still identify and use the small relevant part.

Technology connects data and businesses

A single person can solve a business problem by combining data and expertise, as long as the volume of data is small enough for a person to grasp. The growth of data volumes due to digital technologies has changed the way we approach problems, since using more data requires new tools. In addition, new devices allow us to perform data analysis that would have been impossible on personal computers 10 years ago.

This has changed not only the way we deal with data but also the overall process of making business decisions.

There are several ways to use the information contained in the data. For instance, the movie streaming provider Netflix uses a tool that produces personalized movie recommendations based on each user's interests. Machine learning, a subfield of artificial intelligence, refers to tools that learn from data to provide insights and actions. Machine learning techniques don't just process data; they connect data and the business. This interaction between information and knowledge is crucial and affects almost every step of building a solution.

Knowledge still plays an important role in building the tool that identifies the solution. Since there are many machine learning tools that deal with the same problem, your expertise can be used to choose the most relevant one. In addition, most tools have parameters, so an understanding of the problem is needed to set them, as described in the following figure:

After the machine learning technique has produced a result, we can validate its performance using information and expertise. For instance, in the car dealer example, we can build a tool that automatically identifies the best prices and predicts the time needed to sell each car. Using past data, we can apply the tool to estimate how long it would have taken to sell each car and compare the estimated time with the actual time. In addition, we can look at the prices the tool currently suggests and use knowledge and expertise to see whether they are reasonable. In this way, we check how closely the machine learning results match reality.
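
One simple way to compare estimated and actual selling times is the mean absolute error, sketched below in Python. The two lists of days are made-up illustrative values, and the mean absolute error is just one common choice of measure, not one prescribed by the text.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical past sales: actual days to sell each car, and the days
# that a pricing tool would have estimated for the same cars.
actual_days = [12, 15, 60, 10, 41]
estimated_days = [14, 13, 48, 11, 45]

mae = mean_absolute_error(actual_days, estimated_days)
print(f"The estimates are off by {mae:.1f} days on average")
```

A smaller error means the tool's estimates are closer to what actually happened, which gives a single number for comparing different tools or settings.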

Validation helps in comparing different techniques and choosing the one that performs best. In addition, techniques usually need to be set up with different options, and validation helps in choosing the most suitable option, as described in the following figure:

In conclusion, the interaction between machine learning and business is extremely important, and it takes place in each part of the process of building the solution.