Uploaded on Jul 23, 2018
Read the difference between data mining and web mining.
DIFFERENCE BETWEEN DATA MINING AND WEB MINING
What is Data Mining?
Data mining (Knowledge Discovery in Databases) is the process of discovering useful patterns or knowledge from data sources such as databases, text, images, audio, video, and the web. The patterns must be valid, potentially useful, and understandable. Data mining is a multi-disciplinary field involving machine learning, statistics, databases, artificial intelligence, information retrieval, and visualization.

What is Web Mining?
Web mining is the application of data mining techniques to discover patterns from the web. It extracts, filters, and evaluates information from complex web data and the associated web services for knowledge discovery. It can be divided into three major categories:
1- Web Content Mining (WCM) aims to extract useful information or knowledge from web content such as text, images, audio, and video records.
2- Web Structure Mining (WSM) tries to discover useful knowledge from the structure of hyperlinks and tags.
3- Web Usage Mining (WUM) refers to the discovery of user access patterns from usage logs such as HTTP logs and application server logs.

DIFFERENCES

Definition
  Web mining: Process used to extract information from web documents.
  Data mining: Process used to extract hidden information from a database.

Concept
  Web mining: Pattern identification from web data.
  Data mining: Pattern identification from data available in any system.

Process
  Web mining: The same process as data mining, applied on the web using web documents.
  Data mining: Data extraction -> Pattern discovery -> Develop the feature/solve it (algorithm).

Scale
  Web mining: Contains 10 million jobs in a server database, and therefore search processing is not large.
  Data mining: Contains 1 million jobs in a database, and search processing is large.

Who does this?
  Web mining: Data scientists, data engineers.
  Data mining: Data scientists/data analysts, data engineers.

Structure
  Web mining: Information is obtained from structured, semi-structured, and unstructured web forms. It draws on a wide database.
  Data mining: Information is obtained from an explicit structure. It cannot draw on as wide a database as web mining can.

Access
  Web mining: Data is accessed publicly. The data is not hidden in the web database; only permission from the web log master is required to access it.
  Data mining: Data is accessed privately, and only authorized users can access it.

Data
  Web mining: Works on online data.
  Data mining: Works on offline data.

Data Storage
  Web mining: Data is stored in server logs and the web server database.
  Data mining: Data is stored in data warehouses.

Techniques
  Web mining: Web content mining, graph-based web mining, web usage mining, text mining, and many others.
  Data mining: Artificial neural networks, decision trees, rule induction, the nearest neighbor method, and many others.

Challenges
  Web mining: Complexity of web pages, the sheer size of the web, relevance of information, the web as a dynamic information source, diversity of user communities, etc.
  Data mining: Network settings, data quality, privacy preservation, scalability, complex and heterogeneous data, etc.

Tools
  Web mining: Scrapy, PageRank, Apache logs.
  Data mining: Machine learning algorithms.

How significant?
  Web mining: Pulling in web-related data would influence the existing data mining process.
  Data mining: Many organizations rely on data science results for decision making.

APPLICATION AREAS

Data Mining
  Industry: Application
  Finance: Credit card analysis
  Insurance: Claims, fraud analysis
  Telecommunication: Call record analysis
  Transport: Logistics management
  Consumer goods: Promotion analysis
  Data service providers: Value-added data
  Utilities: Power usage analysis

Web Mining
The most dominant application area for web mining is Internet-based e-commerce (business-to-consumer) and web-based customer relationship management (CRM), an integral part of e-business today. The business benefits that web mining affords to digital service providers include personalization, collaborative filtering, enhanced customer support, product and service strategy definition, particle marketing, and fraud detection.

Health Care
To discover knowledge for understanding the cause of a disease and its treatment.
To track errors made by hospital staff, enable them to correct those errors, and prevent the same errors from being repeated.
To identify patterns that help set policies for health care centers and hospitals.
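As a rough illustration of web usage mining and of the "data extraction -> pattern discovery" process described in the comparison, the sketch below parses a few web server log lines and counts page hits. It is a minimal example under stated assumptions, not the method used in the slides: the log lines are made-up sample data in Apache Common Log Format, and the names SAMPLE_LOG, LOG_PATTERN, extract_requests, page_hits, and pages_per_visitor are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample lines in Apache Common Log Format (illustrative data only).
SAMPLE_LOG = [
    '10.0.0.1 - - [23/Jul/2018:10:00:01 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [23/Jul/2018:10:00:09 +0000] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.2 - - [23/Jul/2018:10:01:15 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [23/Jul/2018:10:01:40 +0000] "GET /cart HTTP/1.1" 404 128',
    'not a log line',
]

# One regex for the assumed log layout: host, timestamp, request line, status, size.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)$'
)


def extract_requests(lines):
    """Data extraction: parse each line into a dict, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


def page_hits(records):
    """Pattern discovery: count successful (2xx) requests per path."""
    return Counter(r["path"] for r in records if r["status"].startswith("2"))


def pages_per_visitor(records):
    """Pattern discovery: collect the distinct paths each host requested."""
    visits = {}
    for r in records:
        visits.setdefault(r["host"], set()).add(r["path"])
    return visits


records = extract_requests(SAMPLE_LOG)
hits = page_hits(records)
print(hits.most_common(1))  # [('/home', 2)]
```

A real deployment would read from the server's actual log files and would likely group requests into sessions before looking for patterns; the counting step here only stands in for that pattern discovery stage.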