pyspark.sql.UDFRegistration.registerJavaFunction

UDFRegistration.registerJavaFunction(name, javaClassName, returnType=None)

Register a Java user-defined function as a SQL function.

In addition to the function name and the Java class that implements it, a return type can optionally be specified. If the return type is not specified, Spark infers it via reflection.

New in version 2.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
name : str

name of the user-defined function

javaClassName : str

fully qualified name of the Java class

returnType : pyspark.sql.types.DataType or str, optional

the return type of the registered Java function. The value can be either a pyspark.sql.types.DataType object or a DDL-formatted type string.
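The accepted forms of returnType can be sketched with a small wrapper. This is a minimal illustration rather than part of PySpark: the helper name `register_java_udfs` and the shape of its `specs` argument are hypothetical, `spark` is assumed to be an active SparkSession, and each Java class is assumed to be available on the JVM classpath.

```python
def register_java_udfs(spark, specs):
    """Register each (name, java_class_name, return_type) spec as a SQL function.

    return_type may be None (the type is then inferred via reflection), a
    DDL-formatted string such as "integer", or a pyspark.sql.types.DataType
    instance. Hypothetical helper for illustration; not part of PySpark.
    """
    registered = []
    for name, java_class_name, return_type in specs:
        # Thin wrapper over the documented call; passing None for the
        # return type is the same as omitting the argument.
        spark.udf.registerJavaFunction(name, java_class_name, return_type)
        registered.append(name)
    return registered
```

Called as `register_java_udfs(spark, [("javaStringLength", "test.org.apache.spark.sql.JavaStringLength", "integer")])`, the helper returns the list of names it registered, here `["javaStringLength"]`.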

Examples

>>> from pyspark.sql.types import IntegerType
>>> spark.udf.registerJavaFunction(
...     "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
>>> spark.sql("SELECT javaStringLength('test')").collect()  
[Row(javaStringLength(test)=4)]
>>> spark.udf.registerJavaFunction(
...     "javaStringLength2", "test.org.apache.spark.sql.JavaStringLength")
>>> spark.sql("SELECT javaStringLength2('test')").collect()  
[Row(javaStringLength2(test)=4)]
>>> spark.udf.registerJavaFunction(
...     "javaStringLength3", "test.org.apache.spark.sql.JavaStringLength", "integer")
>>> spark.sql("SELECT javaStringLength3('test')").collect()  
[Row(javaStringLength3(test)=4)]
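Because the function is registered as a SQL function, it can also be used in SQL expression strings passed to the DataFrame API. The sketch below assumes the function has already been registered as in the examples above, and that the column names are plain identifiers that need no quoting; the helper name `with_java_udf_column` is hypothetical.

```python
def with_java_udf_column(df, udf_name, input_column, output_column):
    """Return df with an extra column computed by a registered SQL function.

    Assumes udf_name was registered beforehand (for example with
    registerJavaFunction) and that the column names are simple identifiers.
    Hypothetical helper for illustration; not part of PySpark.
    """
    # DataFrame.selectExpr accepts SQL expression strings, so the
    # registered function name can be called inside one.
    expression = f"{udf_name}({input_column}) AS {output_column}"
    return df.selectExpr("*", expression)
```

For a DataFrame `df` with a string column `word`, `with_java_udf_column(df, "javaStringLength", "word", "word_length")` would select every existing column plus the computed `word_length` column.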