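As information technology has developed, people's ability to process data has grown substantially, and more and more databases are used in fields such as business management, production control, and engineering design. Yet as the volume of complex data keeps growing, the query functions of existing databases no longer meet users' needs, and the question of whether the information and knowledge people need can be extracted from the data is drawing increasing attention. Traditional statistical techniques face a serious challenge, and data mining, which combines statistics, databases, and knowledge discovery, emerged in response. In recent years data mining has been widely applied in retail, direct marketing, manufacturing, finance and insurance, telecommunications, and medical services.

I. Basic Concepts of Data Mining

(1) The classic "beer and diapers" case

Before defining data mining, consider the "beer and diapers" story. Its protagonist is Wal-Mart, the world's largest retailer. In its thousands of supermarkets across the United States, baby diapers and beer are placed side by side on adjacent shelves, and both sell well. As the story goes, Wal-Mart built a data warehouse, analyzed its raw transaction data, compiled product sales figures by period, and then applied data mining tools to the results. The analysis showed that sales of both beer and diapers were high at Wal-Mart stores on weekends. Further investigation found that in American families with young children, wives often asked their husbands to buy diapers on the way home from work, and the husbands, having picked up the diapers, also bought the beer they liked. Beer and diapers were therefore very often bought together. The stores then broke with convention and put the beer and diaper shelves next to each other, which raised sales of both further. Beer and diapers look unrelated, but under specific conditions they are closely linked, and uncovering such links is what data mining does.

(2) The concept of data mining

Data mining is the process of finding, in massive volumes of raw data, knowledge and information that is implicit, previously unknown, and potentially useful, and then using it to guide action. From a statistical point of view, data mining can be seen as automated exploratory analysis of large, complex data sets by computer. With the rapid development of information technology, the amount of data people accumulate has grown sharply, and data mining is the data processing technology that developed to meet this need.

II. Background: Data Mining in Retail

Retail customer relationship management (Customer Relationship Management, CRM) is a customer-centered marketing philosophy and strategy. The goals of CRM are to shorten sales cycles and cut sales costs, increase revenue, find the new markets and channels needed to expand the business, and raise customer value, satisfaction, profitability, and loyalty. Retail CRM collects large amounts of data about products, customers, suppliers, and stores through bar codes, sales management systems, customer records systems, and other channels. How to use this mass of data to work out which products sell well and which do not, which customers suit which products, and how products should be combined is a headache for retailers. Analyzing the data with data mining tools can help retailers make better-founded decisions: identifying which products customers are most likely to buy together, so that they can be shelved together; analyzing sales trends, to inform purchasing; and analyzing who buys which products, to help choose store locations.

III. Common Data Mining Algorithms

Data mining is a core technology of retail CRM. By analyzing the products customers have bought and the relationships among those products, it identifies customers' buying habits and their tendency to buy items together, which helps retailers set marketing strategy. The algorithms most often used in retail CRM are the following.

(1) Cluster analysis

Cluster analysis groups objects according to their features, on the principle that "birds of a feather flock together", in order to discover regularities and typical patterns. In retail, it can help market analysts separate a consumer database into distinct consumer groups and summarize the spending pattern or habits of each group.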
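To make the clustering idea concrete, here is a minimal sketch of k-means, one common clustering method (the article does not name a specific algorithm, so this choice is an assumption). The customer records, the two features (store visits per month and monthly spend), and the function name `kmeans` are all made up for illustration. Real data would normally be rescaled first so that no single feature dominates the distance.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; returns (centroids, labels)."""
    # Illustrative, deterministic start: use the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, labels

# Hypothetical customers: (visits per month, monthly spend).
customers = [(2, 20), (3, 25), (2, 30), (12, 200), (11, 220), (13, 210)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The two groups that come out (occasional low spenders and frequent high spenders) are the kind of "consumer groups" an analyst would then describe and target separately.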
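(2) Decision trees

A decision tree algorithm uses the training set to generate a test function (a test on an attribute) and creates one branch of the tree for each outcome of the test. Within the subset of data in each branch, it repeats the process to build lower-level nodes and branches, which yields a decision tree. The tree is then pruned and finally converted into rules. Decision trees are commonly used in predictive models: by classifying large amounts of data in a directed way, they surface valuable, latent information. Classification with a trained tree is fast, which makes the method well suited to large-scale data.

(3) Neural networks

A neural network algorithm imitates the function of human neurons: data passes through an input layer, one or more hidden layers, and an output layer, being adjusted and computed at each stage, to produce a result. Its main advantage is that it can predict well on complex problems. It also offers good robustness, adaptivity, and a high degree of fault tolerance.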
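The decision tree construction described above (pick a test, branch on its outcomes, repeat inside each subset, then read the tree off as rules) can be sketched as follows. This is a simplified, ID3-style illustration that chooses the test by information gain, which is one common criterion and an assumption here. Pruning is omitted, and the attribute names and training rows are invented.

```python
from collections import Counter
from math import log2

def entropy(rows, target):
    """Shannon entropy of the target label over the given rows."""
    counts = Counter(r[target] for r in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())

def best_attribute(rows, attributes, target):
    """Pick the attribute whose split gives the highest information gain."""
    def gain(attr):
        remainder = 0.0
        for value in {r[attr] for r in rows}:
            subset = [r for r in rows if r[attr] == value]
            remainder += len(subset) / len(rows) * entropy(subset, target)
        return entropy(rows, target) - remainder
    return max(attributes, key=gain)

def build_tree(rows, attributes, target):
    """Recursively build a nested-dict tree; leaves are class labels."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(rows, attributes, target)
    rest = [a for a in attributes if a != attr]
    return {attr: {value: build_tree([r for r in rows if r[attr] == value],
                                     rest, target)
                   for value in sorted({r[attr] for r in rows})}}

def to_rules(tree, conditions=()):
    """Flatten the tree into (conditions, label) rules, one per leaf."""
    if not isinstance(tree, dict):
        return [(conditions, tree)]
    (attr, branches), = tree.items()
    rules = []
    for value, subtree in branches.items():
        rules += to_rules(subtree, conditions + ((attr, value),))
    return rules

# Hypothetical training set: does the customer respond to a promotion?
rows = [
    {"member": "yes", "weekend": "yes", "buys": "yes"},
    {"member": "yes", "weekend": "no", "buys": "yes"},
    {"member": "yes", "weekend": "no", "buys": "yes"},
    {"member": "no", "weekend": "yes", "buys": "yes"},
    {"member": "no", "weekend": "no", "buys": "no"},
    {"member": "no", "weekend": "no", "buys": "no"},
]
tree = build_tree(rows, ["member", "weekend"], "buys")
rules = to_rules(tree)
for conditions, label in rules:
    print(" and ".join(f"{a}={v}" for a, v in conditions), "->", label)
```

On this toy data the tree tests `member` first and only tests `weekend` for non-members, giving three rules.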
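(4) Association rule mining

Association rule mining discovers relationships among attributes in a database. In essence, the task is to find strong association rules in the database and use them to understand customer behavior. The most typical example is market basket analysis.

IV. Applications of Data Mining in Retail

With the rise of the Web and e-commerce, retail CRM has become a major application area for data mining. Data mining can help identify customer buying behavior, discover buying patterns and trends, improve service quality, achieve better customer retention and satisfaction, raise the sales rate of goods, design better transport and distribution strategies, and reduce business costs. Its applications in retail CRM fall mainly into the following areas.

(1) Multidimensional analysis of sales, customers, products, time, and region using multi-feature data cubes

Multidimensional data analysis means analyzing, querying, and reporting on data along multiple dimensions. A dimension is a particular angle from which people view the data. For example, when a company examines how a product is selling, it usually looks at sales from the angles of customer, product, time, and region; these are the dimensions. Different combinations of the dimensions, together with the measures under study, reveal distinct customer groups in the customer database, so that decision makers can make ordering, sales, and service decisions according to the characteristics of the main customer groups.

(2) Mining association information for purchase recommendations and product cross-references

Association analysis is data mining with association rules. Its aim is to uncover relationships hidden in the data, discovering knowledge of the form "90% of customers who buy product A in a transaction also buy product B". Mining association information from sales records can show that customers who buy one brand of product are likely to buy certain other products. Such information can be used to form purchase recommendations. Retailers can promote the recommendations to improve service, help customers choose products, increase sales, and reduce overstock.

(3) Analyzing the effectiveness of promotions with multidimensional and association analysis

Multidimensional analysis and association analysis can be used to study the data closely and assess customers' buying habits, advertising success rates, and other strategic information. By retrieving recent years' sales data and comparing sales volume and transaction counts during a promotion with the figures before and after it, a retailer can gauge the promotion's effect and forecast seasonal and monthly sales. Analyzing trends in product assortment and inventory also helps identify items to mark down and supports decisions about quantities and operations. In addition, association analysis can identify which products are suited to promotions, which makes it easier to arrange supply and raise sales.

(4) Sequential pattern mining for customer loyalty analysis

Sequential pattern analysis resembles association analysis, but focuses on the order in which events occur in the data. It can be used to analyze changes in customers' spending or loyalty, and prices and product assortment can be adjusted accordingly to retain existing customers, attract new ones, and maintain the customer base. A retailer can analyze the characteristics of former customers who have switched to competitors, search the current customer data for customers who may be about to switch, and then design measures to prevent churn. It can also rank customers by loyalty on the basis of their spending behavior and transaction records, and apply different strategies according to the level of churn risk.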
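The difference between sequential patterns and plain associations is that order matters. The sketch below is a deliberately small illustration, not a full sequential pattern mining algorithm such as GSP or PrefixSpan: it only counts ordered pairs "A was bought, and B was bought at some later point" and reports the share of customers showing each pair. The purchase histories and the function name `sequential_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def sequential_pairs(sequences, min_support):
    """Return {(earlier, later): support} for ordered item pairs.

    Support is the fraction of customers whose purchase sequence
    contains `earlier` at some point before `later`.
    """
    counts = Counter()
    for seq in sequences:
        seen = set()
        # combinations over positions guarantees i < j, i.e. time order.
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] != seq[j]:
                seen.add((seq[i], seq[j]))
        counts.update(seen)  # count each pattern once per customer
    n = len(sequences)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical purchase histories, one list per customer, oldest first.
histories = [
    ["crib", "diapers", "baby food"],
    ["diapers", "baby food", "toys"],
    ["diapers", "toys"],
    ["baby food", "diapers"],
]
patterns = sequential_pairs(histories, min_support=0.5)
print(patterns)
```

Here ("diapers", "baby food") and ("diapers", "toys") each appear for half of the customers, while the reverse order ("baby food", "diapers") appears only once and is filtered out. A customer whose history stops matching the usual sequences is one possible signal to examine in a loyalty analysis.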
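(5) Cross-selling new products or services to existing customers

The relationship between a retailer and its customers is continuous and evolving. Cross-selling is the process of selling new products or services to existing customers. It rests on mutual benefit: customers gain by receiving more and better services that fit their needs, and the business gains from higher sales. The advantage of cross-selling is that a retailer can fairly easily obtain rich information about its existing customers. The customer information a business holds, especially records of past purchases, may contain the key to that customer's next purchase. This is where data mining comes in: it helps the business find the information and factors that influence customers' buying behavior.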
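One way to turn past purchases into cross-selling candidates is to use the association rule measures discussed earlier. The sketch below computes support (the share of baskets containing both items) and confidence (of the baskets containing A, the share that also contain B, which is the kind of figure behind a statement like "90% of customers who buy A also buy B") for every ordered pair of items. The baskets, thresholds, and the function name `pair_rules` are invented for illustration. Production systems typically use algorithms such as Apriori or FP-growth to handle larger itemsets efficiently.

```python
from collections import Counter
from itertools import permutations

def pair_rules(baskets, min_support, min_confidence):
    """Return sorted (A, B, support, confidence) tuples for rules A -> B."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore repeated items within one basket
        item_counts.update(items)
        # permutations yields both (A, B) and (B, A) for each pair.
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), both in pair_counts.items():
        support = both / n
        confidence = both / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Hypothetical transactions.
baskets = [
    ["diapers", "beer", "milk"],
    ["diapers", "beer"],
    ["diapers", "beer", "bread"],
    ["diapers", "milk"],
    ["milk", "bread"],
]
rules = pair_rules(baskets, min_support=0.4, min_confidence=0.7)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
# beer -> diapers: support=0.60, confidence=1.00
# diapers -> beer: support=0.60, confidence=0.75
```

A rule such as diapers -> beer suggests offering beer to existing customers who already buy diapers. Note that the two directions of a pair share the same support but can differ in confidence.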






