Failed to find data source: avro

I am trying to read an Avro file in the Spark shell, but I get the error "Failed to find data source".

The console output is below:

scala> val df = spark.read.format("avro").load("/file.avro")
org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Please find an Avro package at http://spark.apache.org/third-party-projects.html;
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:590)
  at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:86)
  at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:86)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:325)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:135)
  ... 48 elided

hiberstackers Edited question June 3, 2021

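The wording of this error suggests a Spark version earlier than 2.4, where Avro support is not bundled, so you need the Databricks spark-avro jar to read an Avro file. The _2.11 in the jar name is the Scala version and must match the Scala version of your Spark build. Run the command below to download the jar: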

wget https://repo1.maven.org/maven2/com/databricks/spark-avro_2.11/4.0.0/spark-avro_2.11-4.0.0.jar
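If your Spark build uses a different Scala version, the download URL changes with it. The following is a minimal sketch of how the URL above is put together, assuming the standard Maven Central layout; the function name spark_avro_jar_url is hypothetical, and you should confirm that the version pair you pick is actually published before downloading.

```shell
# Hypothetical helper: build the Maven Central URL for the Databricks
# spark-avro jar from a Scala version and a package version.
spark_avro_jar_url() {
  scala_version="$1"     # e.g. 2.11; must match your Spark build
  package_version="$2"   # e.g. 4.0.0
  artifact="spark-avro_${scala_version}"
  printf 'https://repo1.maven.org/maven2/com/databricks/%s/%s/%s-%s.jar\n' \
    "$artifact" "$package_version" "$artifact" "$package_version"
}

# Print the URL for the versions used in this answer.
spark_avro_jar_url 2.11 4.0.0
```

The printed URL can be passed to wget exactly as in the command above.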

Then pass the jar to spark-shell when you start it:

spark-shell --jars spark-avro_2.11-4.0.0.jar
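As an alternative to downloading the jar by hand, spark-shell can fetch it from Maven with the --packages option, which takes a groupId:artifactId:version coordinate. The sketch below only prints the command so it can be inspected first; the helper name spark_avro_coordinate is hypothetical, and the versions are the ones used in this answer.

```shell
# Hypothetical helper: build the Maven coordinate for --packages.
spark_avro_coordinate() {
  scala_version="$1"     # e.g. 2.11; must match your Spark build
  package_version="$2"   # e.g. 4.0.0
  printf 'com.databricks:spark-avro_%s:%s\n' "$scala_version" "$package_version"
}

# Print the launch command instead of running it.
echo "spark-shell --packages $(spark_avro_coordinate 2.11 4.0.0)"
```

Running the printed command requires network access to the Maven repository the first time, since Spark resolves and caches the jar on startup.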

You can now read the Avro file using the full Databricks format name:

val df = spark.read.format("com.databricks.spark.avro").load("/file.avro")

hiberstackers Answered question February 24, 2021