Pass the Databricks Certification Databricks-Certified-Professional-Data-Scientist Questions and answers with ExamsMirror

Viewing page 4 out of 5 pages
Viewing questions 31-40 out of questions
Questions # 31:

A data scientist is asked to implement an article recommendation feature for an on-line magazine.

The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article are available for making recommendations. All of the magazine's articles are stored in a database in a format suitable for analytics.

Which method should the data scientist try first?

Options:

A.

K Means Clustering

B.

Naive Bayesian

C.

Logistic Regression

D.

Association Rules

Questions # 32:

Select the correct statement that applies to logistic regression.

Options:

A.

Computationally inexpensive, easy to implement, and its knowledge representation is easy to interpret

B.

May have low accuracy

C.

Works with numeric values

Questions # 33:

Assume some output variable "y" is a linear combination of some independent input variables "X" plus some independent noise "e". The way the independent variables are combined is defined by a parameter vector B: y = XB + e, where X is an m x n matrix, B is a vector of n unknowns, and y is a vector of m values. Assuming that m is not equal to n and the columns of X are linearly independent, which expression correctly solves for B?

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D
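
For reference, writing the design matrix as X and the observed outputs as y, the standard least-squares estimate for an overdetermined system (m greater than n, columns of X linearly independent) comes from the normal equations, B = (X^T X)^-1 X^T y. The sketch below is only an illustration of that textbook result, not a transcription of the answer options, which are not recoverable from this page. The helper names and the sample data are assumptions, and the inverse is hard-coded for the two-parameter case to keep the example dependency-free.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix; fails if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("X^T X is singular: columns of X are not independent")
    return [[d / det, -b / det], [-c / det, a / det]]


def least_squares(X, y):
    """Solve the normal equations B = (X^T X)^-1 X^T y for n = 2 unknowns."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)                      # n x n
    Xty = matmul(Xt, [[v] for v in y])       # n x 1
    B = matmul(inverse_2x2(XtX), Xty)        # n x 1
    return [row[0] for row in B]


# Hypothetical data: m = 4 observations, n = 2 unknowns (intercept and slope).
# The outputs follow y = 1 + 2x exactly, so the noise term e is zero here.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

B = least_squares(X, y)
print(B)  # approximately [1.0, 2.0]
```

Because m is not equal to n, X itself has no inverse, which is why the square matrix X^T X is inverted instead. In practice a library solver based on QR or SVD is usually preferred over forming the inverse explicitly.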

Questions # 34:

You are asked to create a model to predict the total number of monthly subscribers for a specific magazine. You are provided with 1 year's worth of subscription and payment data, user demographic data, and 10 years' worth of content of the magazine (articles and pictures). Which algorithm is the most appropriate for building a predictive model for subscribers?

Options:

A.

Linear regression

B.

Logistic regression

C.

Decision trees

D.

TF-IDF

Questions # 35:

While working with Netflix, the movie rating website, you have developed a recommender system whose rating predictions are consistently exactly 1 higher than the ratings given in the dataset for every user-item pair. There are n items in the dataset. What will be the calculated RMSE of your recommender system on the dataset?

Options:

A.

1

B.

2

C.

0

D.

n/2
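
The root mean squared error is the square root of the mean of the squared differences between predicted and actual ratings. A minimal sketch of that calculation follows; the function name and the sample ratings are made up for illustration, and only the "every prediction is exactly 1 too high" setup comes from the question.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings; each prediction is exactly 1 higher than the truth.
actual = [3, 4, 1, 5, 2]
predicted = [rating + 1 for rating in actual]

error = rmse(predicted, actual)
print(error)
```

Every squared error is 1, so the mean is 1 regardless of how many ratings there are, and the result does not depend on n.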

Questions # 36:

Select the statement that correctly applies to Naive Bayes.

Options:

A.

Works with a small amount of data

B.

Sensitive to how the input data is prepared

C.

Works with nominal values

Questions # 37:

You are doing advanced analytics for a medical application using regression. You have two variables, weight and height, which are very important input variables that cannot be ignored, and they are also highly correlated. What is the best solution?

Options:

A.

You will take the cube root of the height

B.

You will take the square root of the weight

C.

You will take the square of the height

D.

You would consider using BMI (Body Mass Index)
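
To make the BMI option concrete: Body Mass Index folds weight and height into one derived value, weight divided by height squared, so a regression can use a single feature instead of two highly correlated ones. The sketch below assumes metric units (kilograms and metres), which the question does not specify; the function name and patient values are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Combine weight and height into one feature: kg divided by m squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical (weight in kg, height in m) pairs.
patients = [(70.0, 1.75), (85.0, 1.80), (54.0, 1.60)]

# One derived column replaces the two correlated input columns.
bmi_feature = [body_mass_index(w, h) for w, h in patients]
print(bmi_feature)
```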

Questions # 38:

A bio-scientist is working on the analysis of cancer cells. To identify whether a cell is cancerous or not, hundreds of tests with small variations have been performed. Given the test results for a sample of healthy and cancerous cells, which of the following techniques would you use to determine whether a cell is healthy?

Options:

A.

Linear regression

B.

Collaborative filtering

C.

Naive Bayes

D.

Identification Test

Questions # 39:

The method based on principal component analysis (PCA) evaluates the features according to:

Options:

A.

The projection of the largest eigenvector of the correlation matrix on the initial dimensions

B.

The magnitude of the components of the discriminant vector

C.

The projection of the smallest eigenvector of the correlation matrix on the initial dimensions

D.

None of the above
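
To illustrate what "the projection of an eigenvector of the correlation matrix on the initial dimensions" means, the sketch below builds a correlation matrix from a small made-up dataset, finds its dominant eigenvector with power iteration, and reads each component as the loading of one original feature. This is one simple way to obtain the leading eigenvector, chosen here to avoid external libraries; the data, function names, and iteration count are assumptions, and the code assumes no feature has zero variance.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of a row-oriented dataset."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    # Standardize each column, then average the products of standardized values.
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / n for j in range(d)]
            for i in range(d)]


def dominant_eigenvector(M, iterations=500):
    """Power iteration: repeatedly apply M and renormalize to unit length."""
    v = [1.0] * len(M)
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(len(M))) for i in range(len(M))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data: features 0 and 1 move together, feature 2 mostly does not.
rows = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 3.0],
    [3.0, 6.2, 6.0],
    [4.0, 7.8, 2.0],
    [5.0, 10.1, 4.0],
]

loadings = dominant_eigenvector(correlation_matrix(rows))

# Rank the original features by the absolute size of their loading.
ranking = sorted(range(len(loadings)), key=lambda j: abs(loadings[j]),
                 reverse=True)
print(loadings, ranking)
```

Features with larger absolute loadings contribute more to the direction of greatest variance, which is the sense in which a PCA-based method scores the original features.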

Questions # 40:

In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?

Options:

A.

Discovery

B.

Data Preparation

C.

Model Building

D.

Communicate Results
