What Is Data Mining and How Is It Performed?

The development and spread of technology have changed our habits. Many tasks that we used to carry out in person are now done from a computer, mobile phone, or tablet. We can make bank payments, shop, book hospital appointments, and much more with a few taps on a phone. Every transaction we make over the internet with a phone or computer leaves data behind on the servers of the businesses involved. “Data mining” is the work of collecting and analyzing this accumulated data and extracting the useful parts. In this article, we at GTech explain what data mining is and how it is performed.

How is Data Collected?

There are two main ways to collect data. The first is “open data collection”, which is carried out with your permission and is based on the information you enter or your behavior on the site. The second is “closed data collection”, which is typical of sites such as social networks and search engines. Its purpose is to track all behavior on the site, determine the user's interests, and carry out marketing activities accordingly.

What Are Data Mining Processes?

  • Data Filtering: Determining which data will be used in mining.
  • Data Cleaning: Removing unnecessary, inconsistent, or noisy records from the collected data.
  • Data Merging: Combining related data, or data with similar characteristics, obtained from different sources.
  • Data Selection: Selecting, from the cleaned and merged data, the parts that are suitable for analysis.
  • Data Conversion: Converting the selected data into a form suitable for mining.
  • Mining Study: Applying appropriate data mining algorithms to the prepared data.
  • Interpretation and Verification: Interpreting the results of the mining study and checking their accuracy. Verification is done by comparing the results obtained from different applications.
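
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the order records, the field names, and the helper functions (clean, merge, select, convert, mine) are all hypothetical, and a simple per-customer average stands in for a real mining algorithm.

```python
from statistics import mean

# Data filtering: decide which sources to use.
# Both record sets are hypothetical, made up for illustration.
web_orders = [
    {"customer_id": 1, "amount": "120.50", "channel": "web"},
    {"customer_id": 2, "amount": None, "channel": "web"},        # missing value
    {"customer_id": 3, "amount": "80.00", "channel": "web"},
]
mobile_orders = [
    {"customer_id": 1, "amount": "40.00", "channel": "mobile"},
    {"customer_id": 3, "amount": "-5.00", "channel": "mobile"},  # inconsistent value
]


def clean(records):
    """Data cleaning: drop records with a missing or negative amount."""
    return [
        r for r in records
        if r["amount"] is not None and float(r["amount"]) >= 0
    ]


def merge(*sources):
    """Data merging: combine records from several sources into one list."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def select(records, fields):
    """Data selection: keep only the fields needed for the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def convert(records):
    """Data conversion: turn amount strings into floats."""
    return [{**r, "amount": float(r["amount"])} for r in records]


def mine(records):
    """Mining study (stand-in): average order amount per customer."""
    per_customer = {}
    for r in records:
        per_customer.setdefault(r["customer_id"], []).append(r["amount"])
    return {cid: mean(amounts) for cid, amounts in per_customer.items()}


merged = merge(clean(web_orders), clean(mobile_orders))
prepared = convert(select(merged, ["customer_id", "amount"]))
result = mine(prepared)
print(result)  # {1: 80.25, 3: 80.0}
```

Interpretation and verification would follow as a separate step, for example by comparing this result with the output of a second, independent method.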

What Are Data Mining Methods?

  • Classification: One of the most widely used data mining methods. It examines the properties of the available data and assigns each record to the appropriate one of a set of predefined classes.
  • Association Rules: Another commonly used method. It aims to identify related items in large databases and the connections between them.
  • Clustering: Separates the data into groups according to the relationships among the records themselves, without predefined classes.
  • Estimation: Predicts numerical values that are missing from a data set.
  • Contradiction Analysis: Detects excessive deviations in the data, a task more commonly called outlier or anomaly detection. Unusual credit card spending, for example, is detected with this method.
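
As an illustration of the last method in the list, the sketch below flags unusual card spending with a z-score test, which measures how many standard deviations a value lies from the mean. The spending figures, the find_outliers helper, and the threshold of 2.0 are assumptions chosen for the example. Real fraud detection systems use far richer models, and the article does not say which technique any particular bank applies.

```python
from statistics import mean, pstdev

# Hypothetical daily card spending for one customer (illustrative values).
spending = [42.0, 38.5, 51.0, 45.2, 40.1, 47.3, 950.0, 44.0]


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold`
    population standard deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates.
        return []
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


flagged = find_outliers(spending)
print(flagged)  # [(6, 950.0)]
```

A z-score is easy to read but is itself distorted by extreme values, so on small samples a median-based measure may be a more robust choice.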

In Which Areas Is Data Mining Used?

There is no limit to the areas in which data mining can be used; it can be performed wherever data accumulates. The areas where it is most widely used today include:

  • Banking
    • Segmenting customers according to their credit card usage habits.
    • Evaluating loan applications.
  • Marketing
    • Determining the purchasing habits of individuals.
    • Estimating sales.
    • Market basket analysis.
  • CRM
    • Increasing customer loyalty.
    • Getting the most out of marketing campaigns.
  • E-Commerce
    • Detecting attacks on servers.
    • Determining the behavior of users browsing the website.
  • Insurance
    • Determining insurance risk groups.
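
Market basket analysis, listed under Marketing, is a typical use of association rules. The sketch below computes the two standard measures for a rule such as "bread → milk": support (the share of baskets containing all the items) and confidence (of the baskets containing the left-hand side, the share that also contain the right-hand side). The baskets and the function names are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical shopping baskets (illustrative data only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Share of all baskets that contain `itemset`."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the share that also
    contain `consequent`."""
    n_antecedent = count(antecedent, baskets)
    if n_antecedent == 0:
        return 0.0
    return count(antecedent | consequent, baskets) / n_antecedent


print(support({"bread", "milk"}, baskets))       # 0.6
print(confidence({"bread"}, {"milk"}, baskets))  # 0.75
```

Here three of the five baskets contain both bread and milk, and three of the four baskets with bread also contain milk. A retailer might act on rules whose support and confidence exceed chosen minimums; those cutoffs are a business decision, not something this sketch prescribes.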

GTech provides end-to-end services for implementing Data Warehouse and Business Intelligence solutions, which enable organizations to make strategic decisions by analyzing their data, and for implementing modern data-driven decision support systems.