Power Business Intelligence In The Data Science Visualization Process To Forecast CPO Prices

Forecasting is a data mining technique that draws on the data available in a data warehouse. As science has developed, forecasting has also entered the computational field, where it uses methods such as the artificial neural network (ANN), while simple forecasting can be done with the time series method. However, limited ability to create data visualizations hinders researchers from making the most of their research results. With the development of the Power BI software, the data science process, which involves various fields, can be presented more neatly in the form of visualization. This paper therefore presents the results of forecasting the price of crude palm oil (CPO) to support the development of the CPO business, with the aim of implementing business intelligence (BI) by involving ANN, namely the time series method, for forecasting. Forecasting accuracy with time series was measured with two techniques: the first, MAPE, gave a result of 0.03214%, and the second, MSE, gave a result of 962.91.


I. INTRODUCTION
Data science is an amalgamation of several disciplines, including mathematics, statistics and computation, that together aim to analyze big data sets [1][2][3]. It applies specific algorithms to find patterns in data and to produce accurate forecasts that support decision making in business processes [4][5]. Forecasting is a data mining technique that turns data into knowledge, predicting future data by learning from previous datasets [6]. For example, Lubis [7] forecast big data using the nearest neighbour method on 40,000 credit transaction records in order to support credit provision, and obtained optimal results with 16 minutes of training to gain new knowledge. In addition, Rahmat [8] used big data to predict gold prices with the multilayer perceptron evolving method, a development of the multilayer perceptron that adds virtual nodes to avoid repeated epochs, and obtained a forecasting MAPE of 0.769%. The forecasting process also aims to create a smart system by implementing machine learning in its final results [9]. Machine learning techniques are not much different from data mining techniques [10].
ISSN: 2722-4015 http://ijstm.inarah.co.id
The application of forecasting therefore involves algorithms that learn optimally from previous data [11]. In practice, the data science process simplifies business processes by predicting data from previous data and past periods, so time series algorithms can be used as a data mining application to predict the sustainability of business processes [12]. A time series is a set of observations of an event, symptom or change that occurs over time; it is used to find patterns and to forecast, analyzing past datasets to obtain future data [13]. Time series have been tested in various business processes. Lubis [14] forecast changes in gold prices with time series and measured accuracy in combination with the false alarm rate, obtaining a result of 0.466. Gao & Duru [15] applied mathematical modeling to validate the fuzzy model in the time series algorithm, obtaining out-of-sample accuracy values for the fuzzy time series model. They concluded that hyper-parameters such as the time lag and the partitioning (the number of fuzzy sets, the partition type and the membership functions) are very influential in fuzzy time series modeling, and they used genetic algorithms (GA) in their implementation. Among the many business processes to which these methods can be applied, one data source well suited to the advancement of data science is crude palm oil (CPO) [16]. CPO is traded in the commodity market and is a natural resource that can be processed into derivative products in the form of consumer goods and industrial raw materials. Many computational techniques have been applied in agriculture, and to CPO in particular [17].
Al-Khowarizmi [18] forecast CPO prices using the Simple Evolving Connectionist System method, with the aim of changing existing business processes in plantations by applying computational techniques. In the existing process, CPO is sold through an agent who tenders CPO prices, which makes operational costs too high. The study implemented an online tender process with CPO price forecasts for the next 2 weeks, because the CPO harvest process takes about 2 weeks. The forecasting obtained an accuracy of 0.035%, calculated using the MAPE formula. To obtain a more accurate business process for CPO, Al-Khowarizmi [19] went on to test the accuracy of CPO price forecasting with a different method, KNN, using a MAPE sensitivity measure that involves detection rates, and obtained a better result of 0.000361%. Both studies share the goal of applying business intelligence (BI) in the agricultural sector. BI can be relied on in business because it focuses on systems and technology that collect data from several sources and process it into information for business needs [20]. To be easily accepted by stakeholders, BI reports take the form of visualizations, which present data in structured form with graphics to reveal information hidden in the data [21]. The BI software used as a reference in this paper is Power BI, provided by Microsoft, which makes it easier to present business needs [22]. In this paper, CPO prices are forecast using time series and the results are presented with Power BI to help managers and stakeholders make decisions, in the hope that data processing with data mining techniques becomes a reference in data science.

II. MATERIAL AND METHODS
Dataset
In this paper, forecasting with the time series method requires data for training and testing. The dataset was obtained from www.investing.com and consists of CPO price data from January 2nd 2020 to December 30th 2020. It contains 242 records, of which 75% were used for training and 25% for testing.
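The split described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the records are kept in date order and cut at a single point, and the function name `chronological_split`, the rounding rule and the placeholder series are all assumptions introduced here.

```python
def chronological_split(records, train_fraction=0.75):
    """Split time-ordered records into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    # Index of the first test record; rounding down is an assumption.
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Placeholder values standing in for the 242 daily CPO prices (not real data).
prices = list(range(242))
train, test = chronological_split(prices)
print(len(train), len(test))  # 181 61
```

Keeping the order intact matters for forecasting, because the test period must lie entirely after the training period.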
Time Series Forecasting CPO
Fuzzy time series forecasting was first developed by Song and Chissom in 1993 [23]. It is a data forecasting method that uses fuzzy principles as its basis. Roughly speaking, a fuzzy set can be interpreted as a class of numbers with vague boundaries [24]. If U is the universe of discourse, $U = \{u_1, u_2, \ldots, u_p\}$, then a fuzzy set $A_i$ of U is generally stated in terms of its degrees of membership as follows:

$$A_i = \mu_{A_i}(u_1)/u_1 + \mu_{A_i}(u_2)/u_2 + \cdots + \mu_{A_i}(u_p)/u_p$$

where $\mu_{A_i}(u_j)$ is the degree of membership of $u_j$ in $A_i$, with $\mu_{A_i}(u_j) \in [0,1]$ and $1 \le j \le p$. The value of the degree of membership $\mu_{A_i}(u_j)$ is defined as follows [25]:

$$\mu_{A_i}(u_j) = \begin{cases} 1 & \text{if } j = i \\ 0.5 & \text{if } j = i - 1 \text{ or } j = i + 1 \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

This can be illustrated by the following rules [26]:
Rule 1: If the actual data is included in $u_1$, then the degree of membership for $u_1$ is 1 and for $u_2$ is 0.5, and for every interval other than $u_1$ and $u_2$ it is zero.
Rule 2: If the actual data falls under $u_i$, $1 \le i \le p$, then the degree of membership for $u_i$ is 1 and for $u_{i-1}$ and $u_{i+1}$ is 0.5, and for every interval other than $u_i$, $u_{i-1}$ and $u_{i+1}$ it is zero.
Rule 3: If the actual data is included in $u_p$, then the degree of membership for $u_p$ is 1 and for $u_{p-1}$ is 0.5, and for every interval other than $u_p$ and $u_{p-1}$ it is zero.
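The three membership rules can be illustrated with a short sketch. The paper does not say how the universe of discourse is partitioned, so the equal-width intervals in `interval_index`, the price range and both function names are assumptions made only for this example.

```python
def interval_index(value, lower, upper, p):
    """1-based index of the equal-width interval of [lower, upper] holding value.

    Equal-width partitioning is an assumption, not taken from the paper.
    """
    if not lower <= value <= upper:
        raise ValueError("value lies outside the universe of discourse")
    width = (upper - lower) / p
    # The upper bound itself is assigned to the last interval u_p.
    return min(int((value - lower) / width) + 1, p)


def membership_degrees(i, p):
    """Degrees of membership in u_1..u_p for data that falls in interval u_i."""
    if not 1 <= i <= p:
        raise ValueError("i must satisfy 1 <= i <= p")
    degrees = []
    for j in range(1, p + 1):
        if j == i:
            degrees.append(1.0)   # the interval containing the data
        elif abs(j - i) == 1:
            degrees.append(0.5)   # the neighbouring interval(s)
        else:
            degrees.append(0.0)   # every other interval
    return degrees


# Hypothetical universe [2000, 4000] split into p = 5 intervals.
i = interval_index(2950.0, 2000.0, 4000.0, 5)
print(i, membership_degrees(i, 5))  # 3 [0.0, 0.5, 1.0, 0.5, 0.0]
```

Rules 1 and 3 are the edge cases of the same test: for `i = 1` or `i = p` only one neighbouring interval exists, so only one entry receives 0.5.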
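Forecasting accuracy is then measured with two tests: the first uses MAPE (Mean Absolute Percentage Error) and the second uses MSE (Mean Squared Error). The MAPE is calculated by equation (2) below [27].

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (2)$$

The forecast is also tested with the MSE in equation (3) below [28].

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2 \qquad (3)$$

Here $y_t$ is the actual value, $\hat{y}_t$ is the forecast value and $n$ is the number of observations.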
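As a worked illustration of the two accuracy measures, the sketch below computes MAPE and MSE on three made-up values using only the standard library. The function names and the sample numbers are assumptions chosen for easy arithmetic; they are not the paper's data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def mse(actual, forecast):
    """Mean squared error, in squared units of the data."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Made-up values: absolute percentage errors are 10%, 5% and 0%.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(round(mape(actual, forecast), 2))  # 5.0
print(round(mse(actual, forecast), 2))   # 66.67
```

MAPE is scale-free, while MSE is expressed in squared price units, so the two numbers are not directly comparable in magnitude.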

III. RESULT AND DISCUSSION
At this stage, the CPO price data to be forecast is crawled from www.investing.com. The data covers January 2nd 2020 to December 30th 2020 and is shown in Table 1 below. Table 1 contains a date field and the last price, which is the closing price for that date at 23.59 GMT+7. It also contains the highest price, which is the maximum price on that date, and the lowest price, which is the minimum price on that date. Table 1 records the highest, lowest and closing prices because the price changes every second and minute during that date.

After crawling, the data is assembled into a dataset and stored in the data warehouse so that forecasting can be done easily. The forecasting process in this paper consists of training and testing, with 75% of the dataset used for training and 25% for testing. Forecasting is done with a time series algorithm implemented in Python, with a brief syntax as follows:

    import warnings
    warnings.filterwarnings('ignore')

    import numpy as nphy
    import pandas as pdas
    import matplotlib.pyplot as mplt
    import seaborn as sns
    from dateutil.relativedelta import relativedelta
    from scipy.optimize import minimize
    import statsmodels.formula.api as smfapi
    import statsmodels.tsa.api as smtapi
    import statsmodels.api as smapi
    import scipy.stats as scst
    from itertools import product
    from tqdm import tqdm_notebook

    # Jupyter notebook magic; remove this line when running as a plain script
    %matplotlib inline

    # Plot the series; `currency` is the DataFrame loaded earlier (loading step not shown)
    mplt.figure(figsize=(15, 7))
    mplt.plot(currency.GEMS_GEMS_SPENT)
    mplt.grid(True)
    mplt.show()

    # MAPE and MSE accuracy
    from sklearn.metrics import r2_score, median_absolute_error, mean_absolute_error
    from sklearn.metrics import mean_squared_error, mean_squared_log_error

    def mean_absolute_percentage_error(y_true, y_pred):
        # Mean absolute percentage error, in percent
        return nphy.mean(nphy.abs((y_true - y_pred) / y_true)) * 100

The above is program syntax written with Python tools, with the expectation that it is combined with Power BI tools. Using this time series, the forecasting results are obtained as shown in Table 2 below. Table 2 shows the CPO price forecasts for the 25% test dataset based on the 75% training dataset: the training dataset runs from January 2, 2020 to September 28, 2020, and the test dataset runs from September 29, 2020 to December 30, 2020. From these test results, accuracy is calculated with two techniques, the first using MAPE and the second using MSE. The accuracy calculation is used to obtain the error percentage of the trained model. The MAPE in forecasting CPO prices using the time series algorithm is 0.3214%. The MAPE value is then compared with the MSE.
The MSE result is 962.91. The accuracy value, whether measured with MAPE or MSE, is therefore dynamic: it can change with the number of training and testing data used in forecasting. From these results, the data is presented using Power BI tools as a visualization, and the forecasting results are shown in Figure 2.

From Figure 2, it can be seen that the Power BI display is easier for users to accept because the user interface is more flexible, the information is presented more effectively, decisions can be made more quickly and accurately, and real-time data can easily be controlled on one screen. Thus, based on the final results of this paper, Power BI is an application capable of integrating with Python syntax, so the Python syntax for forecasting with the time series algorithm above can be embedded in Power BI and decision makers in the business can easily forecast using Power BI tools alone. Figure 3 shows the visualization of CPO price forecasting using the time series algorithm.

Fig 3. Python time series forecasting

The process shown in Figure 3 makes it easier for application users, especially business policy makers, to forecast using the time series algorithm. In practice, however, it is necessary to test with dynamic datasets, for example by enlarging the training dataset so that the error becomes very small. Repeating this process as the data structure changes dynamically can become a data science process and reference.
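The dependence of the error values on the training and testing split can be illustrated with a small experiment. The sketch below is not the paper's model: it swaps in a naive last-value forecast as a stand-in, runs on a synthetic upward trend rather than CPO prices, and every function name in it is hypothetical.

```python
def persistence_forecast(train, horizon):
    """Naive stand-in model: repeat the last training value for each test step."""
    return [train[-1]] * horizon


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def evaluate_split(series, train_fraction):
    """Return (MAPE, MSE) on the test part for a given chronological split."""
    cut = int(len(series) * train_fraction)
    train, test = series[:cut], series[cut:]
    forecast = persistence_forecast(train, len(test))
    return mape(test, forecast), mse(test, forecast)


# Synthetic series with a steady upward trend (not CPO data).
series = [100.0 + 2.0 * t for t in range(40)]
results = {f: evaluate_split(series, f) for f in (0.5, 0.75, 0.875)}
for fraction, (mape_value, mse_value) in results.items():
    print(f"train={fraction:.3f}  MAPE={mape_value:.2f}%  MSE={mse_value:.1f}")
```

On this synthetic trend both errors fall as the training share grows, because the test window gets shorter and closer to the last training value. Real price data need not behave this way, which is why repeated testing on updated datasets is suggested.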

IV. CONCLUSIONS
In summary, this paper forecasts CPO prices using the time series algorithm and contributes to computational techniques by comparing the accuracy values obtained with the MAPE formula and the MSE formula. The forecasting uses 242 records, split into a 75% training dataset and a 25% testing dataset. On the 25% test dataset, the MAPE value was 0.03214% and the MSE was 962.61. The accuracy value can change with the amount of training and testing data used in forecasting. CPO price forecasting with the time series algorithm is implemented in Python, and in this paper the Python syntax is embedded in Power BI so that the visualization results make it easier for business policy makers to support decision making. With a concept like the one in this paper, a forecasting process carried out repeatedly can become a data science medium, because it can accommodate business processes from various data and fields.

V. ACKNOWLEDGMENTS
We would like to thank all those who have contributed to this research.