Course Outline

Introduction to Apache Kylin

  • Overview of OLAP and its significance in big data analytics
  • Evolution of Apache Kylin and its architecture
  • Key features and capabilities of Kylin 5.0

Setting Up Apache Kylin

  • Installation prerequisites and environment setup
  • Configuring Kylin with Hadoop, Spark, and Kafka
  • Understanding Kylin's web UI and command-line tools

Data Modeling in Kylin

  • Designing star and snowflake schemas for OLAP cubes
  • Defining dimensions and measures
  • Creating and managing data models in Kylin's web UI

Building and Managing Cubes

  • Cube building process and job management
  • Incremental builds and auto-merge strategies
  • Monitoring cube health and performance

Real-Time Streaming with Kylin

  • Integrating Kafka as a streaming data source
  • Setting up real-time cubes and fusion models
  • Achieving low-latency analytics with streaming data

Querying and Analysis

  • Executing SQL queries using Kylin's query interface
  • Connecting BI tools (e.g., Tableau, Power BI) to Kylin
  • Performing multidimensional analysis and drill-downs

Performance Optimization

  • Best practices for cube design and aggregation
  • Resource management and tuning for scalability
  • Troubleshooting common performance issues

Advanced Topics

  • Security and access control in Kylin
  • Extending Kylin with custom plugins and integrations
  • Exploring Kylin's REST APIs for automation

Summary and Next Steps

Requirements

  • An understanding of Hadoop and big data ecosystems
  • Familiarity with SQL and data warehousing concepts
  • Basic knowledge of streaming data platforms like Kafka

Audience

  • Big data engineers seeking to implement real-time analytics solutions
  • Data analysts aiming to leverage OLAP capabilities on large datasets
  • Data warehouse architects interested in modernizing their infrastructure

Duration: 14 hours
