apache-spark Spark DataFrame

30% OFF - 9th Anniversary discount on Entity Framework Extensions until December 15 with code: ZZZANNIVERSARY9

Introduction

A DataFrame is an abstraction of data organized in rows and typed columns. It is similar to the data found in relational SQL-based databases. Although it has been transformed into just a type alias for Dataset[Row] in Spark 2.0, it is still widely used and useful for complex processing pipelines making use of its schema flexibility and SQL-based operations.


Got any apache-spark Question?