68780 Apache Spark 14 hours
Why Spark?: Problems with Traditional Large-Scale Systems; Introducing Spark
Spark Basics: What is Apache Spark?; Using the Spark Shell; Resilient Distributed Datasets (RDDs); Functional Programming with Spark
Working with RDDs: RDD Operations; Key-Value Pair RDDs; MapReduce and Pair RDD Operations
The Hadoop Distributed File System: Why HDFS?; HDFS Architecture; Using HDFS
Running Spark on a Cluster: Overview; A Spark Standalone Cluster; The Spark Standalone Web UI
Parallel Programming with Spark: RDD Partitions and HDFS Data Locality; Working With Partitions; Executing Parallel Operations
Caching and Persistence: RDD Lineage; Caching Overview; Distributed Persistence
Writing Spark Applications: Spark Applications vs. Spark Shell; Creating the SparkContext; Configuring Spark Properties; Building and Running a Spark Application; Logging
Spark, Hadoop, and the Enterprise Data Center: Overview; Spark and the Hadoop Ecosystem; Spark and MapReduce
Spark Streaming: Spark Streaming Overview; Example: Streaming Word Count; Other Streaming Operations; Sliding Window Operations; Developing Spark Streaming Applications
Common Spark Algorithms: Iterative Algorithms; Graph Analysis; Machine Learning
Improving Spark Performance: Shared Variables: Broadcast Variables; Shared Variables: Accumulators; Common Performance Issues
mdlmrah Model MapReduce and Apache Hadoop 14 hours
The course is intended for IT specialists who work with the distributed processing of large data sets across clusters of computers.
Data Mining and Business Intelligence: Introduction; Area of application; Capabilities; Basics of data exploration
Big data: What does Big data stand for?; Big data and Data mining
MapReduce: Model basics; Example application; Stats; Cluster model
Hadoop: What is Hadoop; Installation; Configuration; Cluster settings; Architecture and configuration of Hadoop Distributed File System; Console tools; DistCp tool; MapReduce and Hadoop Streaming; Administration and configuration of Hadoop On Demand; Alternatives
3119 Business Intelligence in MS SQL Server 2008 14 hours
This training covers the basics of creating a data warehouse environment based on MS SQL Server 2008. Participants gain a foundation for designing and building a data warehouse that runs on MS SQL Server 2008, learn how to build a simple ETL process with SSIS, and then design and implement a data cube using SSAS. Participants will be able to manage an OLAP database: create and delete OLAP databases and process partition changes online. They will also acquire knowledge of XML/A scripting and MDX.
Fundamentals, objectives, and applications of data warehouses; data warehouse server types
Basics of building ETL processes in SSIS
Basics of designing data cubes in Analysis Services: measure groups, measures, dimensions, hierarchies, attributes
Developing a data cube project: calculated measures, partitions, perspectives, translations, actions, KPIs, build and deploy, processing a partition
XML/A basics: partitioning, full and incremental processing, deleting partitions, processing aggregations
MDX language basics
powerbi Power BI 14 hours
Power BI Architecture
Data sources: On-premises and online data sources; Data transformations and the M language; Direct connections to selected sources (SQL Server, OLAP)
Modeling: Relationships between tables (single and multidirectional data filtering)
DAX - templates and best practices: Introduction to DAX; Most commonly used functions and calculation context; Working with the time dimension (including fiscal periods, comparing periods, YTD); Parent-child hierarchies; Filtering data relative to a hierarchy; Popular DAX templates
Visualizations: Interactive data analysis; Selecting the appropriate visualization; Filters, grouping, exclusions; Visualization on maps; Visualizations using the R language; Visualization enhancements (so-called custom visuals)
Data Access Management: Row-Level Security
Teamwork and mobile with Power BI: Dashboards and reports; Q&A mechanism; Workspaces; Mobile applications

Upcoming Courses

Course | Course Date | Price [Remote / Classroom]
Business Intelligence in MS SQL Server 2008 - Strasbourg, Kibitzenau station | Mon, 2017-06-12 09:30 | 2630EUR / 3130EUR
Power BI - Lyon, Gare Lyon Part-Dieu | Tue, 2017-06-13 09:30 | 1410EUR / 1910EUR
Apache Spark - Poitiers, city center | Tue, 2017-06-13 09:30 | 4530EUR / 5030EUR
Model MapReduce and Apache Hadoop - Toulouse, city center | Mon, 2017-06-26 09:30 | 2010EUR / 2510EUR
Business Intelligence in MS SQL Server 2008 - Amiens, city center | Tue, 2017-06-27 09:30 | 2630EUR / 3130EUR

