Apache Hivemall, a collection of machine-learning Hive user-defined functions (UDFs), offers Spark integration as documented here. In this article, we will see how it works in PySpark. Note that Hivemall requires Spark 2.1+; this article uses Spark 2.3 and Hivemall 0.5.2, and the entire contents are available in this Google Colaboratory notebook.

Installation

We first need to set up Spark and Hadoop.
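As a rough sketch of what that setup might look like in a Google Colaboratory environment (the package versions and the Hivemall jar URL below are illustrative assumptions, not taken from the article, and may need adjusting for your environment):

```shell
# Install a JDK (required by Spark) and a PySpark version matching the
# article's Spark 2.3.
apt-get install -y openjdk-8-jdk-headless
pip install pyspark==2.3.2

# Download the Hivemall 0.5.2 jar so it can later be placed on the Spark
# classpath (e.g. via the `spark.jars` configuration when building a
# SparkSession). The artifact name and URL here are assumptions; check
# the Hivemall release page for the exact coordinates.
wget https://repo1.maven.org/maven2/org/apache/hivemall/hivemall-all/0.5.2-incubating/hivemall-all-0.5.2-incubating.jar
```

Once the jar is available locally, it can be passed to PySpark via `SparkSession.builder.config("spark.jars", "<path-to-jar>")` so that Hivemall's UDFs can be registered in the session.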