UCM Week 16 Introduction to Big Data Analytics Discussion

Question Description

Week 16 – Discussion Assignment

Discussion Requirements: (Points = 20)

Summative Discussion Board

Initial Post: Review and reflect on the knowledge you have gained from this course. Based on your review and reflection, write at least 3 paragraphs on the following:

  • What were the most compelling topics learned in this course?
  • How did participating in discussions help your understanding of the subject matter?
  • What approaches could have yielded additional valuable information?
  • The main post should include at least 1 reference to research sources, and all sources should be cited using APA format.

Responses to Other Students: Respond to at least 2 of your fellow classmates with at least a 100-word reply about his or her Primary Task Response regarding items you found to be compelling and enlightening.

Note: Participants must create a thread in order to view other threads in this forum.

Your comments should extend the conversation started with the thread.

All original posts and comments must be substantive (original post deliverable length is about 250 – 300 words). All sources should be cited according to APA guidelines.

Unformatted Attachment Preview

Week 2 – Discussion (Chapter 1-2)

Week 2 – Discussion Assignment

Discussion Requirements: (Points = 20)

Participants must create a thread in order to view other threads in this forum.

Reading: Chapter 1, “Introduction to Big Data Analytics”; Chapter 2, “Data Analytics Lifecycle”

Big Data is data whose scale, distribution, diversity, and timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value.

Initial Post:

  (1) Create a new thread and name a few companies that dominate their sector through data science and data analytics. Mention on what basis these companies are effective in terms of data analytics.
  (2) Chapter 1 discussed what is driving the data deluge. Explain the concept of the data deluge with examples, using four sources such as mobile sensors, social media, video surveillance, video rendering, smart grids, geophysical exploration, medical imaging, and gene sequencing.
  (3) Chapter 2 described the data analytics lifecycle, which is an approach to managing and executing an analytical project. What are the benefits of doing a pilot program before a full-scale rollout of a new analytical methodology? Discuss this in the context of the mini-case study of your selected organization.

Peer Response: Select at least two (2) other students’ threads and post substantive comments on those threads. Your comments should extend the conversation started with the thread.

Note: All original posts and comments must be substantive (original post deliverable length is about 250 – 300 words). All sources should be cited according to APA guidelines.

Week 5 Discussion (Chapter 5)

Week 5 – Discussion Assignment

Discussion Requirements: (Points = 20)

Participants must create a thread in order to view other threads in this forum.

Read: Chapter 5 – Advanced Analytical Theory and Methods: Association Rules

Initial Post: A local retailer has a database that stores 10,000 transactions from last summer. After analyzing the data, a data science team has identified the following statistics:

  • {battery} appears in 6,000 transactions.
  • {sunscreen} appears in 5,000 transactions.
  • {sandals} appears in 4,000 transactions.
  • {bowls} appears in 2,000 transactions.
  • {battery,sunscreen} appears in 1,500 transactions.
  • {battery,sandals} appears in 1,000 transactions.
  • {battery,bowls} appears in 250 transactions.
  • {battery,sunscreen,sandals} appears in 600 transactions.

Provide responses to the following questions:

  1. What are the support values of the preceding itemsets?
  2. Assuming the minimum support is 0.05, which itemsets are considered frequent?
  3. What are the confidence values of {battery}→{sunscreen} and {battery,sunscreen}→{sandals}? Which of the two rules is more interesting?
  4. List all the candidate rules that can be formed from the statistics. Which rules are considered interesting at the minimum confidence 0.25? Out of these interesting rules, which rule is considered the most useful (that is, least coincidental)?
  5. Conduct library research and identify about three types of algorithms that uncover relationships among items and association rules. Compare the identified algorithms with the Apriori algorithm and its properties. Also, include their pros and cons.

Peer Response: Select at least two (2) other students’ threads and post substantive comments on those threads. Your comments should extend the conversation started with the thread.

Note: All original posts and comments must be substantive (original post deliverable length is about 250 – 300 words). All sources should be cited according to APA guidelines.

Week 13 Discussion (Chapter 11)

Discussion Requirements: (Points = 20)

Participants must create a thread in order to view other threads in this forum.

Read: Chapter 11 – “Advanced Analytics – Technology and Tools: In-Database Analytics”

Chapter 11 discussed MADlib, which is an open-source library for scalable in-database analytics. It offers data-parallel implementations of mathematical, statistical, and machine learning methods for structured and unstructured data.

The concept of Magnetic/Agile/Deep (MAD) analysis skills:

  • Magnetic: Traditional Enterprise Data Warehouse (EDW) approaches “repel” new data sources, discouraging their incorporation until they are carefully cleaned and integrated.
  • Agile: Data Warehousing orthodoxy is based on long-range and careful design and planning.
  • Deep: Modern data analyses involve increasingly sophisticated statistical methods that go well beyond the rollups and drill-downs of traditional business intelligence (BI).

Initial Post:

  1. Research and identify three companies, each of which utilizes one of the Traditional Enterprise Data Warehouse (EDW) approaches that apply to the MAD concept of (a) Magnetic, (b) Agile, and (c) Deep analysis skills.
  2. Provide examples of each MAD concept.
  3. Compare and contrast the three companies identified and their impact on the MAD concept they implemented.
  4. Support your ideas and examples with applicable outside sources according to APA guidelines.

Peer Response: Select at least two (2) other students’ threads and post substantive comments on those threads. Your comments should extend the conversation started with the thread.

Note: All original posts and comments must be substantive (original post deliverable length is about 250 – 300 words). All sources should be cited according to APA guidelines. …
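The support and confidence questions in the Week 5 assignment reduce to simple ratios over the transaction counts given there. The following is a minimal Python sketch of that arithmetic, not an official solution: the function names (`support`, `confidence`, `lift`) and the variable names are hypothetical, and using lift as the measure of how "interesting" a rule is represents one common choice rather than something the assignment text specifies.

```python
from fractions import Fraction

# Transaction counts as stated in the Week 5 assignment.
TOTAL = 10_000
counts = {
    frozenset({"battery"}): 6000,
    frozenset({"sunscreen"}): 5000,
    frozenset({"sandals"}): 4000,
    frozenset({"bowls"}): 2000,
    frozenset({"battery", "sunscreen"}): 1500,
    frozenset({"battery", "sandals"}): 1000,
    frozenset({"battery", "bowls"}): 250,
    frozenset({"battery", "sunscreen", "sandals"}): 600,
}

def support(itemset):
    """Fraction of all transactions that contain every item in itemset."""
    return Fraction(counts[frozenset(itemset)], TOTAL)

def confidence(lhs, rhs):
    """Support(lhs and rhs together) divided by Support(lhs)."""
    return support(set(lhs) | set(rhs)) / support(lhs)

def lift(lhs, rhs):
    """Observed co-occurrence relative to what independence would predict.

    Assumption: lift is used here as the interestingness measure.
    """
    return support(set(lhs) | set(rhs)) / (support(lhs) * support(rhs))

# Itemsets meeting the minimum support of 0.05 from the assignment.
MIN_SUPPORT = Fraction(5, 100)
frequent = [s for s in counts if support(s) >= MIN_SUPPORT]

for itemset in counts:
    print(sorted(itemset), float(support(itemset)))

print(float(confidence({"battery"}, {"sunscreen"})))
print(float(confidence({"battery", "sunscreen"}, {"sandals"})))
```

Exact fractions are used so that threshold comparisons are not affected by floating-point rounding. Under these definitions, only {battery,bowls} (support 250/10,000) falls below the 0.05 minimum support, so a rule built on it would not come from a frequent itemset.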