Course Outline

Quick Overview

  • Data Sources
  • Data Management
  • Recommender Systems
  • Target Marketing

Data Types

  • Structured vs. unstructured
  • Static vs. streamed
  • Attitudinal, behavioural, and demographic data
  • Data-driven vs. user-driven analytics
  • Data validity
  • Volume, velocity, and variety of data

Models

  • Building models
  • Statistical models
  • Machine learning

Data Classification

  • Clustering
  • k-Nearest Neighbours, k-means
  • Swarm intelligence (ant colonies, bird flocking)

Predictive Models

  • Decision trees
  • Support vector machines
  • Naive Bayes classification
  • Neural networks
  • Markov models
  • Regression
  • Ensemble methods

Return on Investment (ROI)

  • Benefit-to-cost ratio
  • Software costs
  • Development costs
  • Potential benefits
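
As a rough illustration of the benefit-to-cost ratio listed above, the sketch below divides total expected benefits by total costs and also derives a simple ROI figure. The function name `benefit_cost_ratio` and every monetary figure are hypothetical, chosen only to make the arithmetic concrete; they are not taken from the course material.

```python
def benefit_cost_ratio(benefits, costs):
    """Return total benefits divided by total costs."""
    total_costs = sum(costs)
    if total_costs <= 0:
        raise ValueError("total costs must be positive")
    return sum(benefits) / total_costs


# Hypothetical figures, for illustration only.
software_costs = 20_000
development_costs = 60_000
potential_benefits = 120_000

total_costs = software_costs + development_costs
ratio = benefit_cost_ratio([potential_benefits], [software_costs, development_costs])

# Simple ROI: net benefit expressed as a fraction of total costs.
roi = (potential_benefits - total_costs) / total_costs

print(f"Benefit-to-cost ratio: {ratio:.2f}")
print(f"ROI: {roi:.0%}")
```

A ratio above 1 means the estimated benefits exceed the estimated costs. In practice both sides are forecasts, so the result is only as reliable as those estimates.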

Building Models

  • Data preparation (MapReduce)
  • Data cleansing
  • Choosing methods
  • Model development
  • Model testing
  • Model evaluation
  • Model deployment and integration

Overview of Open Source and Commercial Software

  • Selection of R-project packages
  • Python libraries
  • Hadoop and Mahout
  • Selected Apache projects related to Big Data and Analytics
  • Selected commercial solutions
  • Integration with existing software and data sources

Requirements

Participants should be familiar with traditional data management and analysis methods such as SQL, data warehouses, business intelligence, and OLAP. A solid understanding of basic statistics and probability concepts (e.g., mean, variance, probability, conditional probability) is also required.
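
As an informal self-check on the statistics prerequisites, the short sketch below computes a mean, a population variance, and a conditional probability using only the Python standard library. The sample values and the die-roll events are hypothetical examples, not course material.

```python
from fractions import Fraction
from statistics import mean, pvariance

# Hypothetical sample, for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(values)        # arithmetic mean
var = pvariance(values)  # population variance (divides by n, not n - 1)

# Conditional probability on one roll of a fair six-sided die:
# P(A | B) = P(A and B) / P(B), with A = "even" and B = "greater than 3".
outcomes = range(1, 7)
b = [x for x in outcomes if x > 3]
a_and_b = [x for x in b if x % 2 == 0]
p_a_given_b = Fraction(len(a_and_b), len(b))

print("mean:", mu)
print("population variance:", var)
print("P(even | greater than 3):", p_a_given_b)
```

Because all outcomes are equally likely, the conditional probability reduces to counting outcomes, which is why exact fractions are used instead of floating-point numbers.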

Duration: 21 hours
