Business intelligence And Analytics System For Decision Support 10th Edition by Sharda - Test Bank

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)

Chapter 5 Data Mining

1) In the Cabela’s case study, the SAS/Teradata solution enabled the direct marketer to better identify likely customers and market to them based mostly on external data sources.

Answer: FALSE

Diff: 2 Page Ref: 187-188

2) The cost of data storage has plummeted recently, making data mining feasible for more firms.

Answer: TRUE

Diff: 2 Page Ref: 190

3) Data mining can be very useful in detecting patterns such as credit card fraud, but is of little help in improving sales.

Answer: FALSE

Diff: 2 Page Ref: 190

4) The entire focus of the predictive analytics system in the Infinity P&C case was on detecting and handling fraudulent claims for the company’s benefit.

Answer: FALSE

Diff: 3 Page Ref: 192

5) If using a mining analogy, “knowledge mining” would be a more appropriate term than “data mining.”

Answer: TRUE

Diff: 2 Page Ref: 192

6) Data mining requires specialized data analysts to ask ad hoc questions and obtain answers quickly from the system.

Answer: FALSE

Diff: 2 Page Ref: 194

7) Ratio data is a type of categorical data.

Answer: FALSE

Diff: 1 Page Ref: 195

8) Interval data is a type of numerical data.

Answer: TRUE

Diff: 1 Page Ref: 195

9) In the Memphis Police Department case study, predictive analytics helped to identify the best schedule for officers in order to pay the least overtime.

Answer: FALSE

Diff: 1 Page Ref: 196

10) In data mining, classification models help in prediction.

Answer: TRUE

Diff: 2 Page Ref: 199

11) Statistics and data mining both look for data sets that are as large as possible.

Answer: FALSE

Diff: 2 Page Ref: 200

12) Using data mining on data about imports and exports can help to detect tax avoidance and money laundering.

Answer: TRUE

Diff: 1 Page Ref: 203

13) In the cancer research case study, data mining algorithms that predict cancer survivability with high predictive power are good replacements for medical professionals.

Answer: FALSE

Diff: 2 Page Ref: 211

14) During classification in data mining, a false positive is an occurrence classified as true by the algorithm while being false in reality.

Answer: TRUE

Diff: 2 Page Ref: 215

15) When training a data mining model, the testing dataset is always larger than the training dataset.

Answer: FALSE

Diff: 2 Page Ref: 215

16) When a problem has many attributes that impact the classification of different patterns, decision trees may be a useful approach.

Answer: TRUE

Diff: 2 Page Ref: 218

17) In the 2degrees case study, the main effectiveness of the new analytics system was in dissuading potential churners from leaving the company.

Answer: TRUE

Diff: 2 Page Ref: 222

18) Market basket analysis is a useful and entertaining way to explain data mining to a technologically less savvy audience, but it has little business significance.

Answer: FALSE

Diff: 2 Page Ref: 224

19) The number of users of free/open source data mining software now exceeds that of users of commercial software versions.

Answer: TRUE

Diff: 1 Page Ref: 229

20) Data that is collected, stored, and analyzed in data mining is often private and personal. There is no way to maintain individuals’ privacy other than being very careful about physical data security.

Answer: FALSE

Diff: 2 Page Ref: 234

21) In the Cabela’s case study, what types of models helped the company understand the value of customers, using a five-point scale?

A) reporting and association models
B) simulation and geographical models
C) simulation and regression models
D) clustering and association models

Answer: D

Diff: 3 Page Ref: 108

22) Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from

A) collecting data about customers and transactions.
B) developing a philosophy that is data analytics-centric.
C) analyzing the vast data amounts routinely collected.
D) asking the customers what they want.

Answer: C

Diff: 3 Page Ref: 190

23) All of the following statements about data mining are true EXCEPT

A) the process aspect means that data mining should be a one-step process to results.
B) the novel aspect means that previously unknown patterns are discovered.
C) the potentially useful aspect means that results should lead to some business benefit.
D) the valid aspect means that the discovered patterns should hold true on new data.

Answer: A

Diff: 3 Page Ref: 193

24) What is the main reason parallel processing is sometimes used for data mining?

A) because the hardware exists in most organizations and it is available to use
B) because most of the algorithms used for data mining require it
C) because of the massive data amounts and search efforts involved
D) because any strategic application requires parallel processing

Answer: C

Diff: 3 Page Ref: 193

25) The data field “ethnic group” can be best described as

A) nominal data.
B) interval data.
C) ordinal data.
D) ratio data.

Answer: A

Diff: 2 Page Ref: 195

26) The data field “salary” can be best described as

A) nominal data.
B) interval data.
C) ordinal data.
D) ratio data.

Answer: D

Diff: 2 Page Ref: 195

27) Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?

A) associations
B) visualization
C) classification
D) clustering

Answer: C

Diff: 2 Page Ref: 199

28) Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?

A) associations
B) visualization
C) classification
D) clustering

Answer: D

Diff: 2 Page Ref: 199

29) The data mining algorithm type used for classification somewhat resembling the biological neural networks in the human brain is

A) association rule mining.
B) cluster analysis.
C) decision trees.
D) artificial neural networks.

Answer: D

Diff: 3 Page Ref: 199

30) Identifying and preventing incorrect claim payments and fraudulent activities falls under which type of data mining applications?

A) insurance
B) retailing and logistics
C) customer relationship management
D) computer hardware and software

Answer: A

Diff: 2 Page Ref: 202

31) All of the following statements about data mining are true EXCEPT

A) understanding the business goal is critical.
B) understanding the data, e.g., the relevant variables, is critical to success.
C) building the model takes the most time and effort.
D) data is typically preprocessed and/or cleaned before use.

Answer: C

Diff: 3 Page Ref: 205-208

32) Which data mining process/methodology is thought to be the most comprehensive, according to kdnuggets.com rankings?

A) SEMMA
B) proprietary organizational methodologies
C) KDD Process
D) CRISP-DM

Answer: D

Diff: 2 Page Ref: 213

33) Prediction problems where the variables have numeric values are most accurately defined as

A) classifications.
B) regressions.
C) associations.
D) computations.

Answer: B

Diff: 3 Page Ref: 214

34) What does the robustness of a data mining method refer to?

A) its ability to predict the outcome of a previously unknown data set accurately
B) its speed of computation and computational costs in using the model
C) its ability to construct a prediction model efficiently given a large amount of data
D) its ability to overcome noisy data to make somewhat accurate predictions

Answer: D

Diff: 3 Page Ref: 214

35) What does the scalability of a data mining method refer to?

A) its ability to predict the outcome of a previously unknown data set accurately
B) its speed of computation and computational costs in using the model
C) its ability to construct a prediction model efficiently given a large amount of data
D) its ability to overcome noisy data to make somewhat accurate predictions

Answer: C

Diff: 3 Page Ref: 214

36) In estimating the accuracy of data mining (or other) classification models, the true positive rate is

A) the ratio of correctly classified positives divided by the total positive count.
B) the ratio of correctly classified negatives divided by the total negative count.
C) the ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified positives.
D) the ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified negatives.

Answer: A

Diff: 2 Page Ref: 216
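
Illustration for questions 14 and 36: a minimal sketch that computes the true positive rate and related measures from confusion-matrix counts. The function name `classification_rates` and the sample counts are hypothetical, chosen only for this example.

```python
def classification_rates(tp, fp, tn, fn):
    """Return common accuracy measures from confusion-matrix counts.

    tp / fn: actual positives classified correctly / incorrectly.
    tn / fp: actual negatives classified correctly / incorrectly.
    """
    return {
        # Correctly classified positives over all actual positives.
        "true_positive_rate": tp / (tp + fn),
        # Correctly classified negatives over all actual negatives.
        "true_negative_rate": tn / (tn + fp),
        # Correctly classified cases over all cases.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
rates = classification_rates(tp=40, fp=10, tn=45, fn=5)
```

A false positive (question 14) is counted in `fp`: the case is negative in reality but the classifier labeled it positive.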

37) In data mining, finding an affinity of two products to be commonly together in a shopping cart is known as

A) association rule mining.
B) cluster analysis.
C) decision trees.
D) artificial neural networks.

Answer: A

Diff: 2 Page Ref: 224

38) Third party providers of publicly available datasets protect the anonymity of the individuals in the data set primarily by

A) asking data users to use the data ethically.
B) leaving in identifiers (e.g., name), but changing other variables.
C) removing identifiers such as names and social security numbers.
D) letting individuals in the data know their data is being accessed.

Answer: C

Diff: 3 Page Ref: 234

39) In the Target case study, why did Target send maternity ads to a teen?

A) Target’s analytic model confused her with an older woman with a similar name.
B) Target was sending ads to all women in a particular neighborhood.
C) Target’s analytic model suggested she was pregnant based on her buying habits.
D) Target was using a special promotion that targeted all teens in her geographical area.

Answer: C

Diff: 2 Page Ref: 235-236

40) Which of the following is a data mining myth?

A) Data mining is a multistep process that requires deliberate, proactive design and use.
B) Data mining requires a separate, dedicated database.
C) The current state-of-the-art is ready to go for almost any business.
D) Newer Web-based tools enable managers of all educational levels to do data mining.

Answer: B

Diff: 2 Page Ref: 236

41) In the opening vignette, Cabela’s uses SAS data mining tools to create ________ models to optimize customer selection for all customer contacts.

Answer: predictive

Diff: 2 Page Ref: 187

42) There has been an increase in data mining to deal with global competition and customers’ more sophisticated ________ and wants.

Answer: needs

Diff: 2 Page Ref: 190

43) Knowledge extraction, pattern analysis, data archaeology, information harvesting, pattern searching, and data dredging are all alternative names for ________.

Answer: data mining

Diff: 1 Page Ref: 192

44) Data are often buried deep within very large ________, which sometimes contain data from several years.

Answer: databases

Diff: 1 Page Ref: 193

45) ________ represent the labels of multiple classes used to divide a variable into specific groups, examples of which include race, sex, age group, and educational level.

Answer: Categorical data

Diff: 2 Page Ref: 194

46) In the Memphis Police Department case study, shortly after all precincts embraced Blue CRUSH, ________ became one of the most potent weapons in the Memphis police department’s crime-fighting arsenal.

Answer: predictive analytics

Diff: 2 Page Ref: 196

47) Patterns have been manually ________ from data by humans for centuries, but the increasing volume of data in modern times has created a need for more automatic approaches.

Answer: extracted

Diff: 2 Page Ref: 197

48) While prediction is largely experience and opinion based, ________ is data and model based.

Answer: forecasting

Diff: 2 Page Ref: 198

49) Whereas ________ starts with a well-defined proposition and hypothesis, data mining starts with a loosely defined discovery statement.

Answer: statistics

Diff: 2 Page Ref: 200

50) Customer ________ management extends traditional marketing by creating one-on-one relationships with customers.

Answer: relationship

Diff: 2 Page Ref: 201

51) In the terrorist funding case study, an observed price ________ may be related to income tax avoidance/evasion, money laundering, or terrorist financing.

Answer: deviation

Diff: 3 Page Ref: 204

52) Data preparation, the third step in the CRISP-DM data mining process, is more commonly known as ________.

Answer: data preprocessing

Diff: 2 Page Ref: 206

53) The data mining in cancer research case study explains that data mining methods are capable of extracting patterns and ________ hidden deep in large and complex medical databases.

Answer: relationships

Diff: 3 Page Ref: 210

54) Fayyad et al. (1996) defined ________ in databases as a process of using data mining methods to find useful information and patterns in the data.

Answer: knowledge discovery

Diff: 2 Page Ref: 213

55) In ________, a classification method, the complete data set is randomly split into mutually exclusive subsets of approximately equal size and tested multiple times on each left-out subset, using the others as a training set.

Answer: k-fold cross-validation

Diff: 2 Page Ref: 216
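
Illustration for question 55: a minimal sketch of how k-fold cross-validation partitions a data set, using only the Python standard library. The helper names `k_fold_indices` and `train_test_rounds` are hypothetical, and model training itself is left out.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Randomly split row indices 0..n_rows-1 into k disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index yields folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_rounds(n_rows, k, seed=0):
    """Yield (train, test) index lists, using each fold once as the test set."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 10 rows split into 5 folds.
rounds = list(train_test_rounds(n_rows=10, k=5))
```

Each round trains on k-1 folds and tests on the left-out fold; the accuracy estimate is typically the average over the k rounds.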

56) The basic idea behind a ________ is that it recursively divides a training set until each division consists entirely or primarily of examples from one class.

Answer: decision tree

Diff: 3 Page Ref: 218
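
Illustration for question 56: a minimal sketch of recursive division on categorical attributes. Using Gini impurity to choose the split is an assumption made for this example; the question does not name a splitting criterion. The names `gini`, `build_tree`, and `classify`, and the sample data, are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the list is the same."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def build_tree(rows, labels, attributes):
    """Recursively split on categorical attributes until a division is pure."""
    if len(set(labels)) == 1 or not attributes:
        # Leaf: predict the majority class of this division.
        return Counter(labels).most_common(1)[0][0]

    def weighted_gini(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())

    # Split on the attribute that leaves the purest divisions.
    best = min(attributes, key=weighted_gini)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return (best, branches)


def classify(tree, row):
    """Follow branches until a leaf (a class label string) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Hypothetical training set: the class depends only on "student".
rows = [
    {"income": "high", "student": "no"},
    {"income": "high", "student": "yes"},
    {"income": "low", "student": "no"},
    {"income": "low", "student": "yes"},
]
labels = ["no", "yes", "no", "yes"]
tree = build_tree(rows, labels, ["income", "student"])
prediction = classify(tree, {"income": "low", "student": "yes"})
```

This sketch assumes class labels are strings and that every attribute value seen at prediction time also appeared in training.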

57) As described in the 2degrees case study, a common problem in the mobile telecommunications industry is defined by the term ________, which means customers leaving.

Answer: customer churn

Diff: 2 Page Ref: 221

58) Because of its successful application to retail business problems, association rule mining is commonly called ________.

Answer: market-basket analysis

Diff: 2 Page Ref: 224

59) The ________ is the most commonly used algorithm to discover association rules. Given a set of itemsets, the algorithm attempts to find subsets that are common to at least a minimum number of the itemsets.

Answer: Apriori algorithm

Diff: 2 Page Ref: 226
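
Illustration for question 59: a minimal, unoptimized sketch of the level-wise search behind the Apriori algorithm. The function name `frequent_itemsets`, the parameter `min_count`, and the sample baskets are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Level-wise search for itemsets found in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Build larger candidates only from itemsets that were frequent,
        # and keep a candidate only if all of its subsets are frequent.
        size += 1
        unions = {a | b for a in level for b in level if len(a | b) == size}
        candidates = [
            c for c in unions
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
]
result = frequent_itemsets(baskets, min_count=3)
```

With these baskets, every single item appears three times, but the only pair that reaches the minimum count is diapers and beer.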

60) One way to accomplish privacy and protection of individuals’ rights when data mining is by ________ of the customer records prior to applying data mining applications, so that the records cannot be traced to an individual.

Answer: de-identification

Diff: 2 Page Ref: 234
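
Illustration for questions 38 and 60: a minimal sketch of de-identification by removing direct identifiers before analysis. The function name `de_identify`, the field names, and the sample records are hypothetical.

```python
def de_identify(records, identifier_fields=("name", "ssn")):
    """Return copies of the records with direct identifier fields removed."""
    return [
        {field: value for field, value in record.items()
         if field not in identifier_fields}
        for record in records
    ]


# Hypothetical customer records.
customers = [
    {"name": "A. Example", "ssn": "000-00-0000", "age_group": "young", "spend": 120.0},
    {"name": "B. Example", "ssn": "000-00-0001", "age_group": "elderly", "spend": 80.5},
]
anonymous = de_identify(customers)
```

The original records are left unchanged; only the copies passed on for mining lose the identifier fields.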

61) List five reasons for the growing popularity of data mining in the business world.

Answer:

More intense competition at the global scale driven by customers’ ever-changing needs and wants in an increasingly saturated marketplace
General recognition of the untapped value hidden in large data sources
Consolidation and integration of database records, which enables a single view of customers, vendors, transactions, etc.
Consolidation of databases and other data repositories into a single location in the form of a data warehouse
The exponential increase in data processing and storage technologies
Significant reduction in the cost of hardware and software for data storage and processing
Movement toward the de-massification (conversion of information resources into nonphysical form) of business practices

Diff: 2 Page Ref: 190

62) What are the differences between nominal, ordinal, interval and ratio data? Give examples.

Answer:

Nominal data contain simple codes assigned to objects as labels; the codes are not measurements. For example, the variable marital status can be generally categorized as (1) single, (2) married, and (3) divorced.
Ordinal data contain codes assigned to objects or events as labels that also represent the rank order among them. For example, the variable credit score can be generally categorized as (1) low, (2) medium, or (3) high. Similar ordered relationships can be seen in variables such as age group (i.e., child, young, middle-aged, elderly) and educational level (i.e., high school, college, graduate school).
Interval data are variables that can be measured on interval scales. A common example of interval scale measurement is temperature on the Celsius scale. In this particular scale, the unit of measurement is 1/100 of the difference between the melting temperature and the boiling temperature of water at atmospheric pressure; that is, there is not an absolute zero value.
Ratio data include measurement variables commonly found in the physical sciences and engineering. Mass, length, time, plane angle, energy, and electric charge are examples of physical measures that are ratio scales. Informally, the distinguishing feature of a ratio scale is the possession of a nonarbitrary zero value. For example, the Kelvin temperature scale has a nonarbitrary zero point of absolute zero.

Diff: 2 Page Ref: 194-195
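
Illustration for question 62: a short worked example of why ratios are meaningful on a ratio scale (Kelvin) but not on an interval scale (Celsius). The variable names and the two sample temperatures are hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to the Kelvin scale."""
    return celsius + 273.15


cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the conversion.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are only meaningful on a ratio scale: 20 C is not "twice as hot" as 10 C.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)
```

The Celsius ratio is 2.0 only because of where the scale places its zero; on the Kelvin scale the same two temperatures differ by a factor of about 1.035.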

63) List and briefly describe the six steps of the CRISP-DM data mining process.

Answer:

Step 1: Business Understanding – The key element of any data mining study is to know what the study is for. Answering such a question begins with a thorough understanding of the managerial need for new knowledge and an explicit specification of the business objective regarding the study to be conducted.
Step 2: Data Understanding – A data mining study is specific to addressing a well-defined business task, and different business tasks require different sets of data. Following the business understanding, the main activity of the data mining process is to identify the relevant data from many available databases.
Step 3: Data Preparation – The purpose of data preparation (more commonly called data preprocessing) is to take the data identified in the previous step and prepare it for analysis by data mining methods. Compared to the other steps in CRISP-DM, data preprocessing consumes the most time and effort; most believe that this step accounts for roughly 80 percent of the total time spent on a data mining project.
Step 4: Model Building – Here, various modeling techniques are selected and applied to an already prepared data set in order to address the specific business need. The model-building step also encompasses the assessment and comparative analysis of the various models built.
Step 5: Testing and Evaluation – In step 5, the developed models are assessed and evaluated for their accuracy and generality. This step assesses the degree to which the selected model (or models) meets the business objectives and, if so, to what extent (i.e., do more models need to be developed and assessed).
Step 6: Deployment – Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process across the enterprise. In many cases, it is the customer, not the data analyst, who carries out the deployment steps.

Diff: 2 Page Ref: 205-212

64) Describe the role of the simple split in estimating the accuracy of classification models.

Answer: The simple split (or holdout or test sample estimation) partitions the data into two mutually exclusive subsets called a training set and a test set (or holdout set). It is common to designate two-thirds of the data as the training set and the remaining one-third as the test set. The training set is used by the inducer (model builder), and the built classifier is then tested on the test set. An exception to this rule occurs when the classifier is an artificial neural network. In this case, the data is partitioned into three mutually exclusive subsets: training, validation, and testing.

Diff: 2 Page Ref: 215
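
Illustration for question 64: a minimal sketch of the simple split, using the two-thirds/one-third proportions mentioned in the answer. The function name `simple_split` and its parameters are hypothetical.

```python
import random


def simple_split(rows, train_fraction=2 / 3, seed=0):
    """Shuffle rows and split them into disjoint training and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 30 rows, represented here by their row numbers.
train_set, test_set = simple_split(range(30))
```

Shuffling before cutting keeps the split random, so neither subset inherits any ordering present in the original data.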

65) Briefly describe five techniques (or algorithms) that are used for classification modeling.

Answer:

Decision tree analysis. Decision tree analysis (a machine-learning technique) is arguably the most popular classification technique in the data mining arena.
Statistical analysis. Statistical techniques were the primary classification algorithm for many years until the emergence of machine-learning techniques. Statistical classification techniques include logistic regression and discriminant analysis.
Neural networks. These are among the most popular machine-learning techniques that can be used for classification-type problems.
Case-based reasoning. This approach uses historical cases to recognize commonalities in order to assign a new case into the most probable category.
Bayesian classifiers. This approach uses probability theory to build classification models based on the past occurrences that are capable of placing a new instance into a most probable class (or category).
Genetic algorithms. This approach uses the analogy of natural evolution to build directed-search-based mechanisms to classify data samples.
Rough sets. This method takes into account the partial membership of class labels to predefined categories in building models (collection of rules) for classification problems.

Diff: 2 Page Ref: 218

66) Describe cluster analysis and some of its applications.

Answer: Cluster analysis is an exploratory data analysis tool for solving classification problems. The objective is to sort cases (e.g., people, things, events) into groups, or clusters, so that the degree of association is strong among members of the same cluster and weak among members of different clusters. Cluster analysis is an essential data mining method for classifying items, events, or concepts into common groupings called clusters. The method is commonly used in biology, medicine, genetics, social network analysis, anthropology, archaeology, astronomy, character recognition, and even in MIS development. As data mining has increased in popularity, the underlying techniques have been applied to business, especially to marketing. Cluster analysis has been used extensively for fraud detection (both credit card and e-commerce fraud) and market segmentation of customers in contemporary CRM systems.

Diff: 2 Page Ref: 220
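
Illustration for question 66: a minimal sketch of clustering one-dimensional values. k-means is used here as an assumption, since the answer above does not name a specific clustering algorithm. The function name `k_means_1d`, the starting centers, and the spend figures are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers by repeatedly assigning each to its nearest center."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical annual spend figures for six customers.
spend = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(spend, centers=[0, 50])
```

Members of the same cluster end up close to each other and far from the other cluster, which is the grouping property a market segmentation relies on.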

67) In the data mining in Hollywood case study, how successful were the models in predicting the success or failure of a Hollywood movie?

Answer: The researchers claim that these prediction results are better than any reported in the published literature for this problem domain. Fusion classification methods attained up to 56.07% accuracy in correctly classifying movies and 90.75% accuracy in classifying movies within one category of their actual category. The SVM classification method attained up to 55.49% accuracy in correctly classifying movies and 85.55% accuracy in classifying movies within one category of their actual category.

Diff: 3 Page Ref: 233-234

68) In lessons learned from the Target case, what legal warnings would you give another retailer using data mining for marketing?

Answer: If you look at this practice from a legal perspective, you would conclude that Target did not use any information that violates customer privacy; rather, they used transactional data that almost every other retail chain is collecting and storing (and perhaps analyzing) about their customers. What was disturbing in this scenario was perhaps the targeted concept: pregnancy. There are certain events or concepts that should be off limits or treated extremely cautiously, such as terminal disease, divorce, and bankruptcy.

Diff: 2 Page Ref: 236

69) List four myths associated with data mining.

Answer:

Data mining provides instant, crystal-ball-like predictions.
Data mining is not yet viable for business applications.
Data mining requires a separate, dedicated database.
Only those with advanced degrees can do data mining.
Data mining is only for large firms that have lots of customer data.

Diff: 2 Page Ref: 236

70) List six common data mining mistakes.

Answer:

Selecting the wrong problem for data mining
Ignoring what your sponsor thinks data mining is and what it really can and cannot do
Leaving insufficient time for data preparation
Looking only at aggregated results and not at individual records
Being sloppy about keeping track of the data mining procedure and results
Ignoring suspicious findings and quickly moving on
Running mining algorithms repeatedly and blindly
Believing everything you are told about the data
Believing everything you are told about your own data mining analysis
Measuring your results differently from the way your sponsor measures them

Diff: 2 Page Ref: 236-237
