Unsupervised data mining | Business & Finance homework help

BUA 6315: Business Analytics for Decision Making

Module 6 Assignment Handout: Unsupervised Data Mining

Overview:

In this assignment, you will learn how to apply three unsupervised data mining techniques using country-level health and population measures data and social media usage patterns data.

Prompt:

For this assignment, you will analyze the three case studies below and address the questions associated with each.

For all cases, first partition each data set into 50% training, 30% validation, and 20% test, and use 12345 as the default random seed. If a predictor variable’s values are in character format, treat it as a categorical variable. Otherwise, treat it as a numerical variable.
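For readers who want to see the partitioning idea outside Analytic Solver, the 50/30/20 split can be sketched in Python. This is only an illustration: the function name `partition_rows` is hypothetical, and Python's random generator will not reproduce the row assignment that Analytic Solver produces for seed 12345.

```python
import random

def partition_rows(rows, seed=12345, train=0.5, valid=0.3):
    """Shuffle rows reproducibly, then cut into training/validation/test.

    Hypothetical helper for illustration; whatever is left after the
    training and validation shares becomes the test partition.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # fixed seed -> same split every run
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

# 38 placeholder row indices standing in for the 38 countries
train_set, valid_set, test_set = partition_rows(range(38))
print(len(train_set), len(valid_set), len(test_set))
```

Rerunning with the same seed gives the same three partitions, which is the purpose of fixing the seed in the assignment.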

Case 1:

For this case, first download the data: Health Population data (available in Blackboard).

Next review the following case study:

The data set Health Population contains country-level health and population measures for 38 countries from the World Bank’s 2000 Health Nutrition and Population Statistics database. For each country, the measures include death rates per 1,000 people (Death Rate, %), health expenditure per capita (Health Expend, in US$), life expectancy at birth (Life Exp, in years), male adult mortality rate per 1,000 male adults (Male Mortality), female adult mortality rate per 1,000 female adults (Female Mortality), annual population growth (Population Growth, in %), female population (Female Pop, in %), male population (Male Pop, in %), total population (Total Pop), size of labor force (Labor Force), births per woman (Fertility Rate), birth rate per 1,000 people (Birth Rate), and gross national income per capita (GNI, in US$).

Then complete the actions below and record your answers in a Microsoft Word document.

Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with both clustering methods and how to interpret results, refer to the following videos from the module’s lesson:

● Hierarchical Cluster Analysis – Introduction (5:14)

● Using Analytic Solver to Perform Agglomerative Clustering (4:42)

● Using Analytic Solver to Perform K-Means Clustering (3:15)

Section 1: Hierarchical Clustering:

1. Perform agglomerative clustering to group the 38 countries according to the health measures listed below. Use Euclidean distance and the average linkage clustering method (Group average linkage) to cluster the data into three clusters. Is data standardization necessary in this case? Explain.

Use the following measures only: Death Rate, Health Expend, Life Exp, Male Mortality, Female Mortality, Fertility Rate, and Birth Rate.

2. Report the size and the average GNI per capita of each group.
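The two questions above can be illustrated with a minimal Python sketch of average-linkage agglomerative clustering on z-scored data. Everything here is an assumption made for illustration: the seven rows and the GNI values are made up (they are not the Health Population data), the helper names `standardize` and `average_linkage` are hypothetical, and the z-scores use the population standard deviation, which may differ from what Analytic Solver applies.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column (population std. dev.; columns must not be constant)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def average_linkage(points, k):
    """Repeatedly merge the two clusters with the smallest average pairwise
    Euclidean distance until only k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        return mean(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        x, y = min(pairs, key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(y)        # y > x, so index x is unaffected
        clusters[x].extend(merged)
    return clusters

# Made-up rows: [death rate, health expenditure per capita]
rows = [[5.0, 4000.0], [6.0, 4200.0], [5.5, 3900.0],
        [9.0, 300.0], [9.5, 250.0],
        [14.0, 40.0], [15.0, 35.0]]
gni = [42000, 45000, 40000, 3000, 2500, 500, 450]   # made-up GNI per capita

raw_clusters = average_linkage(rows, 3)               # no standardization
std_clusters = average_linkage(standardize(rows), 3)  # z-scored first

# Size and average GNI per cluster (GNI is not used to form the clusters)
for members in std_clusters:
    print(sorted(members), "size:", len(members),
          "mean GNI:", mean(gni[i] for i in members))
```

On this toy data the raw and standardized runs give different groupings, because expenditure in dollars is several orders of magnitude larger than the death rate and dominates the unscaled Euclidean distance. That scale mismatch is the issue the standardization question asks about.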

Section 2: K-Means Clustering:

3. Perform K-Means clustering to group the 38 countries into three clusters according to the health measures listed below. Is data standardization necessary in this case? Explain.

Use the following measures only: Death Rate, Health Expend, Life Exp, Male Mortality, Female Mortality, Fertility Rate, and Birth Rate.

4. Describe the characteristics of each cluster by comparing the averages of GNI per capita, Population Growth, Labor Force, Fertility Rate and Birth Rate of each group and report your findings in a table.
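The K-Means steps above, followed by the per-cluster averages, can be sketched in Python as follows. This is a rough illustration under stated assumptions: the data are made up, the helper names `zscore` and `kmeans` are hypothetical, and the starting centroids are chosen with a deterministic farthest-point rule so the example is reproducible. That rule is a substitution for the random starts a tool such as Analytic Solver would typically use.

```python
import math
from statistics import mean, pstdev

def zscore(rows):
    """Z-score each column (population std. dev.; columns must not be constant)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point start (an
    assumption made here for reproducibility, not a random start)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:        # no change -> converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels

# Made-up rows: [death rate, health expenditure per capita]
rows = [[5.0, 4000.0], [6.0, 4200.0], [5.5, 3900.0],
        [9.0, 300.0], [9.5, 250.0],
        [14.0, 40.0], [15.0, 35.0]]
labels = kmeans(zscore(rows), 3)

# Made-up profiling variables, averaged per cluster for the summary table
profile = {"GNI": [42000, 45000, 40000, 3000, 2500, 500, 450],
           "Fertility Rate": [1.6, 1.7, 1.5, 2.8, 3.0, 5.5, 6.0]}
for j in sorted(set(labels)):
    idx = [i for i, lab in enumerate(labels) if lab == j]
    row = {name: round(mean(vals[i] for i in idx), 2)
           for name, vals in profile.items()}
    print(f"cluster {j}: n={len(idx)}", row)
```

Each printed line corresponds to one row of the kind of cluster-profile table the question asks for.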

Case 2:

For this case, first download the data: Social Media Usage data (available in Blackboard).

Next review the following case study:

Adrian Brown is a researcher studying social media usage patterns. In his research, he noticed that people tend to use multiple social media applications, and he wants to find out which popular social media applications are often used together by the same user. He surveyed 100 users about which social media applications they use on a regular basis.

Then complete the actions below and record your answers in a Microsoft Word document.

Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with association rules and how to interpret results, refer to the following videos from the module’s lesson:

● Association Rule Analysis (9:11)

● Using Analytic Solver to Perform Association Rule Analysis (3:41)

1. When you select the data, ignore User ID and select the data with labels. Do not forget to select the “First Row Contains Headers” option. Generate association rules with a minimum support of 20 and a minimum confidence of 60%. How many rules are generated?

2. Sort the rules by lift ratio. What is the top rule? Report and interpret the lift ratio of the top rule.

Submission Guidelines:

Your completed assignment must be submitted as a Microsoft Word document, 1-2 pages in length, double-spaced, in 12-point Times New Roman font with 1-inch margins. The submission must be accompanied by three Microsoft Excel spreadsheets showing your work. Only the Word document will be assessed for grading purposes; however, the case spreadsheets are required and must be submitted to show your work.

Note: No tables or charts need to be included in your Word document for this assignment.

Note About Grading:

This assignment will be assessed based on the accuracy of your responses to each question in the worksheet.
